Over the past decade, considerable effort has been devoted to the
analysis and design of feedback controllers for systems characterized by
uncertain dynamic descriptions, and both frequency-domain and time-domain
techniques have been produced by these studies. In this thesis, additional
aspects of the analysis and design problems are investigated. Techniques
are derived to enhance the frequency-domain analysis of perturbed systems
and to generalize the time-domain concept of self-tuning control to
multivariable systems.

The first part of the thesis addresses the problem of generating an
accurate description of frequency response uncertainty for systems whose
models  are  generated  via  system  identification.  Finite  weighting  sequence 
models  are  found  to  be  particularly  useful  for  this  purpose.  Techniques  are 
derived  to  quantify  the  variability  of  the  frequency  response  estimates 
associated  with  the  given  model,  to  identify  the  optimal  truncation  level 
for  the  specified  identification  test,  and  to  assess  the  impact  of  the  bias 
introduced by truncation. These results are used to establish precise
element-by-element frequency response uncertainty bounds which can, for
example, be used to produce characteristic locus inclusion bands for the
analysis  of  perturbed  system  stability  and  performance. 

The second part of the thesis considers the use of multivariable
weighting sequences in the development of on-line computer-implemented
control algorithms. "Characteristic subsystem" decompositions are derived
for multivariable systems to establish system descriptions that are amenable
to the development of real-time computer algorithms which achieve the
desired control objectives in a true generalized-Nyquist sense. This
"characteristic subsystem" formulation is shown to produce a much more
accurate means of implementing conventional characteristic locus designs.
More importantly, it is used to derive a multivariable, generalized-Nyquist
extension of standard single-input/single-output self-tuning algorithms for
the control of uncertain multivariable systems.
CHAPTER ONE
INTRODUCTION 

1.1  Historical  Background 

By  the  beginning  of  the  twentieth  century,  the  basic  analytical 
concepts  of  automatic  control  were  well  established  in  terms  of  "time- 
domain"  formulations  involving  ordinary  differential  equations  and  their 
related  characteristic  algebraic  equations.  However,  in-depth  studies  of 
automatic  control  did  not  begin  to  flourish  until  the  early  1930s,  with 
advances  in  the  area  of  long-range  communication.  Spurred  by  the 
development  of  feedback  amplifiers  and,  more  specifically,  the  peculiar 
relationship  between  changes  in  loop  gain  and  system  stability,  H.  Nyquist 
introduced  the  powerful  tools  of  complex  variable  theory  to  the  analysis  of 
feedback  system  behaviour  and  produced  his  now-celebrated  "Nyquist  stability 
criterion"  to  explain  the  observed  phenomena. 

Interest  in  a  frequency  response  approach  to  feedback  control  became 
even  more  intense  with  H.W.  Bode's  development  of  rules  for  the  optimum 
shaping  of  the  loop-gain  frequency  function  for  feedback  amplifiers  and  with 
the  introduction  of  gain/phase  vs  frequency  diagrams  (Bode  plots)  and  the 
concepts  of  gain  and  phase  margins.  The  recognized  flexibility  of  these 
frequency  response  methods  soon  led  to  the  spread  of  frequency  response 
concepts  into  other  fields  (such  as  mechanical  and  aeronautical  engineering) 
as  well.  Indeed,  the  tremendous  research  efforts  spawned  by  the  Second 
World  War  soon  produced  a  unified  and  coherent  theory  for  single-loop 
feedback  systems  based  on  these  frequency  response  concepts.  By  the  1950s, 
frequency  response  techniques  (such  as  Nyquist  diagrams,  Bode  plots,  and 
Evans'  root  locus  approach)  were  used  routinely  for  the  analysis  and  design 
of  feedback  control  systems. 

In  the  late  1950s  however,  interest  began  to  shift  away  from  the  use  of 
frequency  response  methods  in  automatic  control  due  mainly  to  the  emergence 
of  the  digital  computer  as  a  reliable  and  widely-available  engineering  tool 
and to a simultaneous growing interest in space exploration. With the
increasing  availability  of  tremendous  computing  power  and  speed,  it  was  now 
becoming  possible  to  address  much  more  difficult  control  problems  (such  as 
the  simultaneous  control  of  several  interacting  variables)  and  to  consider 
different  controller  objectives  for  which  the  now-classical  frequency 
response  theory  was  inappropriate.  Furthermore,  the  primary  technical 
problems  of  interest  at  the  time  (those  posed  by  the  space  program)  were 
particularly  amenable  to  formulations  based  on  very  accurate  mathematical 
models  and  on  performance  objectives  which  could  be  conveniently  stated  in 
terms  of  "economic"  conditions  (e.g.  minimum  fuel  or  minimum  time).  For 
these  reasons,  attention  was  refocused  on  the  ordinary  differential  equation 
approach  to  control  system  design,  and  the  development  of  a  "state-space" 
description  for  dynamical  systems  was  soon  combined  with  advances  in  optimal 
control  theory  to  produce  an  effective  means  of  addressing  the  multivariable 
control  problem.  Indeed,  investigations  of  the  linear  optimal  control 
problem  with  a  quadratic  performance  index  using  a  state-space  framework 
produced  the  well-known  Linear  Quadratic  (LQ)  state  feedback  theory;  a 
theory  which  established  a  powerful  synthesis  tool  for  the  solution  of 
multivariable  control  problems.  In  addition,  a  precise  duality  between  the 
control  problem  and  the  filtering  (or  state  estimation)  problem  was  soon 
recognized  and  used  to  derive  state-space  techniques  for  extracting  state 
estimates  from  corrupted  measurements.  Ultimately,  this  led  to  the  Linear 
Quadratic  Gaussian  (LQG)  optimal  control  methodology  which  became  the 
foundation  for  the  state-space  treatment  of  multivariable  control  problems. 

While  state-space  and  optimal  control  techniques  proved  to  be 
particularly  useful  in  aerospace  applications,  it  was  soon  recognized  that 
these  same  techniques  were  inappropriate  for  many  industrial  applications. 
For  instance,  the  mathematically-precise  system  models  available  for 
aerospace  systems  were  not  generally  available  for  many  industrial  systems. 
In  addition,  the  specification  of  control  objectives  in  terms  of  precise 
performance indices was much less obvious, particularly to control engineers
who  were  well-versed  in  the  classical  frequency  response  approach  to 
control.  Furthermore,  the  dynamical  complexity  of  the  LQG  controller  was, 
in  many  cases,  not  appropriate  for  the  simple  control  tasks  to  be  performed. 
For  these  reasons,  interest  in  multivariable  extensions  of  classical 
frequency  response  methods  arose. 

Initial  attempts  to  achieve  these  extensions  focused  on  the  task  of 
selecting  a  cascaded  compensator  to  diagonalize  the  system  transfer  function 
matrix.  Once  this  was  achieved,  controller  design  could  be  completed  using 
standard  single-loop  techniques.  But,  it  was  soon  recognized  that  the 
identification  of  a  "diagonalizing"  compensator  was  complicated  and,  more 
importantly,  was  unreliable  and  unnecessary.  Indeed,  Rosenbrock,  with  the 
development of his inverse Nyquist array (INA) design methodology [ROS1],
used  the  concept  of  diagonal  dominance  to  demonstrate  that  cascaded 
compensators  need  only  be  introduced  to  reduce  multivariable  interaction  to 
an  acceptable  level,  thereby  permitting  the  successful  application  of 
single-loop  techniques.  However,  because  the  INA  approach  produced  only 
sufficient  conditions  for  system  stability,  an  element  of  conservatism  was 
introduced  in  the  resulting  design  process,  and  this  conservatism  generated 
a  tendency  to  overdesign  the  closed-loop  system.  In  addition,  the  selection 
of  compensators  to  achieve  diagonal  dominance  was  still  found  to  be  an  ad 
hoc  and,  in  many  instances,  daunting  task,  which  (though  vital  to  the  INA 
method)  was  not  necessary  for  good  system  performance. 

The  motivation  for  the  INA  approach  to  multivariable  control  design  was 
the eventual deployment of classical single-input/single-output (SISO)
frequency  response  techniques  during  the  last  stage  of  the  design  study. 
However,  investigations  soon  began  to  focus  on  the  multivariable  system  in 
terms  of  a  single  entity,  the  system  transfer  function  matrix,  rather  than 
as  a  collection  of  distinct  scalar  loops  and  to  address  the  task  of 
generalizing  the  basic  concepts  of  the  classical  single-loop  approach  to 
this new problem. Again, a complex variable approach proved to be
particularly  useful  and  led  to  the  development  of  the  Characteristic  Locus 
Method  (CLM)  for  the  analysis  and  design  of  multivariable  systems  ([MAC1], 
[MAC2],  [MAC3]).  In  essence,  the  characteristic  locus  technique  reduced  the 
open-loop  multivariable  system  to  a  set  of  SISO  "subsystems"  using 
frequency-dependent  eigenvalue/eigenvector  decompositions.  This  procedure 
established  a  necessary  and  sufficient  condition  (the  generalized  Nyquist 
criterion)  for  the  assessment  of  multivariable  system  stability. 
Furthermore,  the  analyticity  of  the  eigenvalue  representations  at  almost  all 
points  in  the  complex  plane  was  shown  to  guarantee  a  one-to-one 
correspondence  between  the  distance  of  the  characteristic  loci  from  the 
(-1,0)  point  and  the  location  of  the  system's  closed-loop  poles.  As  such, 
the  direct  manipulation  of  the  characteristic  loci  using  classical  SISO 
frequency  response  methods  was  found  to  produce  effective  compensation,  and 
so  multivariable  feedback  compensators  could  be  designed  without  resorting 
to  ad  hoc  attempts  to  diagonalize  the  plant.  In  effect,  the  CLM  provided 
a  firm  theoretical  foundation  for  the  direct  application  of  classical 
frequency  response  techniques  to  the  much  more  difficult  multivariable 
control  problem. 
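
As a rough illustration of the decomposition described above, the
characteristic loci can be traced numerically by computing the eigenvalues of
the open-loop transfer function matrix at each point on a frequency grid. The
sketch below is not taken from the thesis: the 2x2 plant `example_plant`, the
function name `characteristic_loci`, and the frequency grid are assumptions
chosen purely for illustration, and no attempt is made to sort the eigenvalues
into continuous branches.

```python
import numpy as np


def example_plant(s):
    """Hypothetical 2x2 open-loop transfer function matrix G(s) (illustrative only)."""
    return np.array([
        [1.0 / (s + 1.0), 0.5 / (s + 2.0)],
        [0.2 / (s + 1.0), 1.0 / (s + 3.0)],
    ], dtype=complex)


def characteristic_loci(plant, omegas):
    """Return the eigenvalues of plant(j*w) for each w in omegas.

    Row i holds the (unsorted) eigenvalues at omegas[i]; plotting each
    column in the complex plane gives a crude picture of the loci.
    """
    return np.array([np.linalg.eigvals(plant(1j * w)) for w in omegas])


omegas = np.logspace(-2, 2, 200)
loci = characteristic_loci(example_plant, omegas)

# Smallest distance from any computed locus point to the critical point -1.
min_distance = np.min(np.abs(loci + 1.0))
print(f"closest approach to -1: {min_distance:.3f}")
```

The grid covers positive frequencies only; applying the generalized Nyquist
criterion would still require counting encirclements of the critical point
over the full Nyquist contour.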

1.2  Recent  Developments 

Over  the  past  decade,  control  research  has  shifted  its  attention  from 
the  development  of  analysis  and  design  techniques  for  accurately-known 
systems  to  the  more  realistic  situation  where  there  is  an  element  of 
uncertainty  (e.g.  unmodelled  dynamics,  parameter  variations,  etc)  associated 
with  the  system.  Under  these  more  realistic  conditions,  examples  have  been 
generated  to  demonstrate  that,  in  certain  cases,  commonly-used  design 
techniques  (e.g.  LQG,  INA  and  CLM)  may  lead  to  unreliable  closed-loop 
designs.  These  examples  have  clearly  pointed  out  a  need  for  the  systematic 
treatment  of  model  uncertainties  in  the  development  of  analysis  techniques 
to assess the robust stability and performance characteristics of feedback
control systems and in the development of design techniques to produce
adequate and robust closed-loop behaviour. Recent investigations have, in
fact,  produced  important  results  in  each  of  these  areas. 

For  analysis  purposes,  a  number  of  studies  have  focused  on  frequency- 
domain  concepts  to  address  the  problem  of  model  uncertainty.  From  these 
efforts,  two  particularly  important  methods  have  emerged:  (1)  the  use  of 
frequency-dependent  singular  value  bounds  to  assess  uncertainty  tolerance 
via the small gain theorem, and (2) the development of characteristic locus
inclusion bands to produce a direct extension of the generalized Nyquist
diagram for uncertain systems. Initial investigations by Doyle and Stein
[DOY1] examined the situation where system uncertainty is modelled as
norm-bounded (but otherwise unconstrained) perturbations to a nominal model in
the  frequency  domain,  and  they  derived  a  necessary  and  sufficient  singular 
value  condition  for  perturbed  system  stability.  This  condition  yields  a 
quick  assessment  of  stability  tolerances  via  an  upper  bound  on  the  size  of 
allowable  perturbations,  but  it  does  not  take  explicit  account  of  the  phase 
information  required  to  assess  robust  performance  using  a  generalized- 
Nyquist  approach.  In  an  attempt  to  retain  this  phase  information,  a  number 
of  efforts  have  focused  on  the  development  of  eigenvalue  inclusion  bands  for 
the  same  class  of  unstructured  perturbations  [DAN1],  [DAN2].  These  efforts 
have led to the "E-contour" method proposed by Daniel and Kouvaritakis
[DAN3] which produces exact inclusion regions for the characteristic loci of
the perturbed system and, hence, establishes a necessary and sufficient
stability condition via the generalized Nyquist criterion while
simultaneously providing both gain and phase information on the perturbed
system.

The  desire  to  extend  these  analysis  results  to  the  more  likely 
situation  where  structural  information  on  the  perturbation  is  available  has 
also yielded a number of useful results. For example, Safonov [SAF1] has
examined the class of diagonal perturbations for the purpose of stipulating
robust stability margins, while Doyle [DOY2] has examined the same problem
for the more general class of block-structured perturbations. In both
cases,  the  use  of  similarity  scaling  techniques  to  exploit  uncertainty 
structure  was  found  to  yield  results  superior  to  those  obtained  using  an 
unstructured  uncertainty  approach.  Indeed,  the  stability  results  derived  by 
Doyle  are  known  to  be  necessary  and  sufficient  for  the  case  of  up  to  three 
blocks  in  the  perturbation  matrix,  and  hence  provide  an  exact  assessment  of 
perturbed  system  stability  for  these  situations.  The  block-structured 
results  can  also  be  applied  to  the  case  where  a  known  upper  bound  on  the 
magnitude  of  the  frequency  response  uncertainty  associated  with  each  element 
of  the  transfer  function  matrix  is  available.  However,  the  complexity  of 
the  algorithm  used  to  compute  the  stability  bound  and  the  potential 
conservatism  in  the  resulting  stability  assessment  have  prompted  other 
researchers  to  investigate  this  element-by-element  structured  uncertainty 
problem in more detail. Results derived by Kantor and Andres [KAN1], Lunze
[LUN1], and Owens and Chotai [OWE1] each yield an explicit and
easy-to-compute stability condition for this problem. But because these
conditions rely on spectral radius results, the stability assessment may be
very conservative. Kouvaritakis and Latchman have also examined this problem
in detail ([KOU4], [KOU5]), and they have demonstrated that optimal
nonsimilarity scaling can be used to produce an exact stability assessment
for the element-by-element problem. Moreover, these results have been
extended to the more general block-structured perturbation problem [KOU7],
[DAN4]. In addition, although these results rely on singular value
conditions, they have also been used to extend the E-contour results to the
case of structured perturbations [KOU4], [KOU6]. Hence, both singular value
and  eigenvalue  inclusion  techniques  are  now  available  to  assess  system 
robustness  for  the  much  more  realistic  problem  of  structured  uncertainties. 

Frequency-domain  techniques  have  also  proven  to  be  popular  for  the 
development  of  robust  design  procedures.  Indeed,  using  frequency  response 
uncertainty  bounds  to  characterize  performance  and  stability  limitations, 
Doyle and Stein have proposed an extension of the state-space LQG design
approach for systems with model uncertainties [DOY1], a procedure which has
come to be known as the Linear Quadratic Gaussian with Loop Transfer
Recovery (LQG/LTR) design method. But by combining state-space and
frequency-domain concepts, this methodology tends to rely on engineering
insight rather than a closed-form synthesis procedure to develop the
appropriate compensator. In search of a closed-form solution to the robust
design problem, research has focused instead on the $H^\infty$ approach to
feedback design proposed originally by Zames [ZAM1]. In general, the
$H^\infty$ methods that have been developed to date rely on a unique
formulation (the "Youla parametrization") for the set of all stabilizing
controllers, Q(s), to obtain the particular controller which minimizes the
$H^\infty$ norm of an appropriately weighted combination of the sensitivity
and complementary sensitivity functions for the given system. [Note that,
for a system with transfer function matrix G(s), the $H^\infty$ norm is given
by $\sup_{\omega} \bar{\sigma}\{G(j\omega)\}$.] This $H^\infty$ optimization
produces a closed-form solution for the specified control problem which is
guaranteed to be stable and simultaneously satisfies the robust design
requirements that have been implicitly incorporated into the problem by the
selection of appropriate frequency-dependent weighting. Indeed, this
$H^\infty$ approach addresses the robust design problem in much the same way
as the state-space/optimal control approach addresses the conventional
multivariable control problem. As such, the same concerns arise. For
example, the selection of suitable weighting functions for use in the
performance index is not obvious and the effects of changing these functions
are not apparent. In addition, the optimal $H^\infty$ controller tends to be
very complex. It might, therefore, be argued that alternative design
procedures may be required for a wide variety of systems where the
complexity of the $H^\infty$ design approach is not justified. In light of
this, the development of design techniques which parallel the classical
frequency response methods but which also cater for the robustness aspects
of the
problem may offer an effective alternative. Some preliminary efforts in
this  area  have  been  generated  by  Daniel  and  Kouvaritakis  [DAN2],  though  a 
great  deal  of  additional  research  is  still  required.  With  the  analysis 
tools  that  are  now  available,  it  seems  reasonable  to  anticipate  that 
significant  progress  in  this  area  can  be  achieved  provided  accurate 
frequency  response  uncertainty  information  can  be  included  in  the  analysis. 
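
The norm used in the optimization discussed above, the supremum over
frequency of the largest singular value of G(jw), can be approximated by
evaluating the singular values on a frequency grid. The following is a
minimal sketch under stated assumptions: the plant `diagonal_plant`, the
function name `hinf_norm_on_grid`, and the grid are hypothetical, and a
grid search only yields a lower bound on the true supremum.

```python
import numpy as np


def hinf_norm_on_grid(plant, omegas):
    """Largest maximum singular value of plant(j*w) over the grid omegas."""
    return max(
        np.linalg.svd(plant(1j * w), compute_uv=False)[0] for w in omegas
    )


def diagonal_plant(s):
    """Hypothetical stable plant whose peak gain (2.0) occurs at w = 0."""
    return np.array([[2.0 / (s + 1.0), 0.0],
                     [0.0, 1.0 / (s + 1.0)]], dtype=complex)


# Include w = 0 explicitly, since the logarithmic grid never reaches it.
omegas = np.concatenate(([0.0], np.logspace(-3, 3, 400)))
peak_gain = hinf_norm_on_grid(diagonal_plant, omegas)
print(peak_gain)
```

For plants with sharp resonances the grid would need to be refined near the
peaks, which is one reason dedicated algorithms are normally preferred to a
plain grid search.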

Although  the  development  of  robust  control  design  methodologies  has 
tended  to  rely  on  frequency-domain  techniques,  the  past  decade  has  also  been 
witness  to  the  development  of  an  alternate,  time-domain  approach  to  control 
design  which  caters  for  system  uncertainty.  This  approach,  commonly 
referred  to  as  self-tuning  control,  combines  an  on-line  computer-implemented 
control  algorithm  with  an  on-line  model  identification  algorithm  to  update 
the  system  description  based  on  observed  changes  in  the  input/output  data 
of  the  plant.  In  effect,  the  self-tuning  approach  to  control  design 
accounts  for  system  uncertainty  by  adjusting  the  feedback  control  algorithm 
in  real-time  based  on  available  observations.  Initiated  by  the  key  paper  of 
Astrom and Wittenmark [AST1], research in this area has generated a number
of successful SISO algorithms including methods based on the concepts of
'Generalized Minimum Variance' ([CLA1], [CLA2]), pole placement ([VEL1],
[AST2]), and long-range predictive control ([CUT1], [CLA3], [CLA4]). This
success has also prompted attempts by a number of researchers (including
Borisson [BOR1], Goodwin et al. [GOO2], Koivo [KOI1], and Dugard et al.
[DUG1])  to  extend  these  SISO  self-tuning  algorithms  to  multivariable 
systems.  However,  like  initial  attempts  to  extend  classical  frequency- 
domain  techniques  to  the  multivariable  problem,  these  multivariable  efforts 
have  focused  primarily  on  producing  designs  which  completely  decouple  the 
dynamics  of  the  open-loop  plant  so  that  SISO  techniques  can  be  applied 
directly  to  the  diagonal  elements  of  the  noninteractive  system.  As  such, 
they  fail  to  account  for  the  true  multivariable  characteristics  of  the 
system  and,  as  a  result,  have  met  with  only  limited  success  in  solving  the 
self-tuning problem for multivariable systems.
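
The basic self-tuning loop described above (identify on-line, then
recompute the control from the latest model) can be sketched for a scalar
plant as follows. This is an illustrative assumption rather than any of the
cited algorithms: the first-order noise-free plant, the recursive
least-squares estimator, the one-step-ahead control law, and all names
(`rls_update`, `run_self_tuner`, `A_TRUE`, `B_TRUE`) are hypothetical.

```python
import numpy as np

# Hypothetical "true" plant: y[t+1] = a*y[t] + b*u[t] (unknown to the controller).
A_TRUE, B_TRUE = 0.8, 0.5


def rls_update(theta, P, phi, y_new):
    """One recursive least-squares step for the model y_new ~ phi @ theta."""
    gain = P @ phi / (1.0 + phi @ P @ phi)
    theta = theta + gain * (y_new - phi @ theta)
    P = P - np.outer(gain, phi @ P)
    return theta, P


def run_self_tuner(setpoint=1.0, steps=50):
    theta = np.array([0.0, 1.0])   # initial guess for [a, b]
    P = 1000.0 * np.eye(2)         # large initial covariance: low confidence
    y = 0.0
    for _ in range(steps):
        a_hat, b_hat = theta
        # Guard against division by a near-zero gain estimate.
        b_safe = b_hat if abs(b_hat) > 1e-3 else 1e-3
        # Certainty-equivalence control: pick u so the model predicts y = setpoint.
        u = (setpoint - a_hat * y) / b_safe
        y_next = A_TRUE * y + B_TRUE * u          # plant response
        theta, P = rls_update(theta, P, np.array([y, u]), y_next)
        y = y_next
    return theta, y


theta_hat, y_final = run_self_tuner()
print(theta_hat, y_final)
```

The sketch treats a single loop only; it does not address the multivariable
interaction that the decoupling-based extensions mentioned above attempt to
remove.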

1.3  Description  of  the  Thesis 
1.3.1  Objectives 

From the discussion above, it is clear that significant progress has
been made towards the goal of establishing analysis and design techniques
which address the practical problem of robust control system design in a
systematic manner. But, it is also apparent that considerable effort is
still required to complete the task. From a frequency-domain perspective,
the ability to accurately quantify frequency response uncertainty is a
particularly pressing issue. Indeed, each of the frequency-domain
techniques highlighted above relies, to a large extent, on an accurate
description of frequency response uncertainty. Without this information,
the robust stability criteria provide only an indication of the size of
assumed perturbations that can be tolerated before stability problems arise.
No information on the robust performance characteristics of the system can
be produced. Hence, in the absence of this uncertainty information, these
analysis tools can provide only part of the assessment required for the
design of robust control systems. For the LQG/LTR and $H^\infty$ design methods
meanwhile,  the  lack  of  accurate  frequency  response  uncertainty  information 
leaves  the  design  engineer  at  a  loss  when  attempting  to  specify  precise 
control  design  objectives.  So,  instead  of  producing  a  design  tailored  to 
the  uncertainty  characteristics  of  the  system,  these  methodologies  may  yield 
arbitrary  designs  based  on  the  specification  of  inappropriate  design 
objectives. 

Unfortunately,  while  the  development  of  frequency-domain  analysis  and 
design  tools  has  flourished,  the  task  of  generating  the  required  uncertainty 
information  has  been  virtually  ignored.  For  the  large  class  of  practical 
systems  whose  models  are  generated  via  system  identification  procedures,  it 
is,  however,  possible  to  generate  an  accurate  statistical  description  of 
frequency response uncertainty. Indeed, some recent developments (e.g.
[LJU1], [LJU2], [EDM2], [LOH1]) have yielded limited results by producing
various procedures for generating statistical information on the frequency
response  estimates  at  specified  individual  frequencies.  The  statistical 
nature  of  the  uncertainty,  however,  suggests  that  the  uncertainty 
description  at  any  one  frequency  is  dependent  upon  information  at  other 
frequencies.  Thus,  an  uncertainty  description  that  will  be  useful  within 
the  framework  of  available  analysis  and  design  methodologies  (a  framework 
which  requires  this  information  over  a  large  range  of  frequencies)  must 
quantify  both  the  statistical  uncertainty  at  individual  frequencies  and  the 
interfrequency  relationships  associated  with  this  information. 

In  the  first  part  of  the  thesis,  an  investigation  of  this  problem  is 
conducted  with  the  goal  of  developing  procedures  to  generate  a  complete 
statistical  description  of  frequency  response  uncertainty.  As  a  parametric 
approach  to  model  identification  is  adopted,  the  development  addresses  both 
the  problem  of  quantifying  the  variability  of  the  frequency  response 
estimates  associated  with  the  estimated  model  and  the  problem  of  describing 
the  bias  introduced  by  the  selection  of  a  given  model  structure.  In 
addition, the problem of characterizing the interfrequency dependence of the
uncertainty  information  will  be  solved  by  producing  an  algorithm  which 
yields  accurate  statistical  information  simultaneously  over  the  entire 
frequency range of interest. The results derived yield not only an accurate
description of frequency response uncertainty, but also a description that
can be tailored to the specific frequency response characteristics of the
system.  In  addition,  the  information  will  be  produced  in  a  format  that  is 
appropriate  for  use  in  existing  robust  analysis  and  design  methodologies. 

Following  these  developments,  attention  shifts  to  the  problem  of 
control  design  and,  more  specifically,  to  the  problem  of  developing  a 
multivariable  algorithm  that  is  suitable  for  on-line  computer 
implementations.  As  highlighted  previously,  several  such  algorithms  have 
already been proposed for multivariable self-tuning applications. For the
most  part  however,  these  results  have  met  with  limited  success  because,  like 
the  first  attempts  to  extend  classical  frequency  response  techniques  to  the 
multivariable  problem,  they  have  focused  on  attempts  to  decouple  the  plant 
dynamics  so  that  SISO  techniques  can  be  applied  directly.  As  suggested  by 
the  frequency-domain  developments  of  the  generalized-Nyquist  approach,  what 
is  clearly  required  to  address  this  time-domain  multivariable  problem  in  a 
direct  manner  is  the  ability  to  embed  existing  SISO  algorithms  within  a 
characteristic  locus  framework.  In  the  second  part  of  the  thesis,  the  task 
of  developing  this  capability  is  undertaken  with  the  goal  of  producing  an 
on-line  multivariable  control  algorithm  that  can  be  used  with  conventional 
frequency-domain  designs  to  yield  much  more  accurate  control  implementations 
than  those  currently  available  and,  in  addition,  is  suitable  for  self-tuning 
implementations  to  handle  the  control  problems  associated  with  uncertain 
multivariable  systems.  Weighting  sequence  models,  which  proved  useful  for 
the  frequency  response  uncertainty  developments  in  the  first  part  of  the 
thesis,  also  provide  a  direct  link  from  the  frequency  domain  to  the  time 
domain.  This  link  is  used  to  develop  a  "characteristic  subsystem" 
decomposition  for  multivariable  systems  within  which  existing  SISO 
algorithms  can  be  applied.  For  conventional  designs,  this  decomposition  is 
used  to  generate  an  "exactly"  commutative  controller  which  permits  the 
direct  modification  of  the  open-loop  characteristic  loci  simultaneously  over 
all  frequencies.  For  self-tuning  applications,  the  result  is  an  on-line 
algorithm  which  handles  the  multivariable  characteristics  of  the  system 
explicitly  within  a  generalized-Nyquist  framework. 

1.3.2  Structure 

The  material  presented  in  this  thesis  is  organized  into  nine  chapters 
according  to  the  following  general  structure: 

Chapter 1   : Introduction
Chapter 2   : Summary of Multivariable Frequency Response Methods
Chapters 3-6: Identification of Frequency Response Uncertainty Information
Chapters 7-8: On-Line Multivariable Control Design
Chapter 9   : Conclusion

Chapter  2  summarizes  the  predominant  frequency-domain  methods  for  the 
analysis  of  "certain"  and  "uncertain"  multivariable  systems.  For  "certain" 
systems,  the  generalized-Nyquist/characteristic-locus  approach  is  described, 
and  its  implications  for  the  assessment  of  closed-loop  stability  and 
performance  are  discussed.  For  "uncertain"  systems,  several  important 
analysis  techniques  are  described  including  results  based  on  both  singular 
value  bounds  and  eigenvalue  inclusion  regions.  The  results  for  the  case  of 
element-by-element  structured  uncertainty  are  of  particular  interest  and  are 
described  in  some  detail.  These  frequency-domain  concepts  provide  the 
foundation  and  motivation  for  most  of  the  subsequent  developments. 

The  problem  of  defining  a  precise  class  of  allowable  element-by-element 
perturbations  for  systems  whose  dynamic  descriptions  are  produced  via  system 
identification  techniques  is  addressed  in  Chapters  3  through  6,  and  accurate 
statistical  descriptions  of  frequency  response  uncertainty  are  derived  for 
both  scalar  and  multivariable  systems.  The  development  begins  in  Chapter  3 
with  an  investigation  of  the  variability  of  the  frequency  response  estimates 
produced by the identification process [CLO1]. A parametric approach to the
model identification problem is adopted and, by first quantifying
uncertainty in the parameter space and then transforming this information
into the frequency domain, an accurate frequency response uncertainty
description (in terms of confidence regions for the true system frequency
response) is derived at each individual frequency. At the same time, the
interfrequency dependence of this uncertainty information is quantified by
producing  a  description  that  is  valid  simultaneously  over  all  frequencies. 

Finite  weighting  sequence  models  are  found  to  be  particularly  useful 
for  this  purpose  because  they  define  exact  linear  transformations  from  the 
parameter space to the frequency domain. The introduction of these models,
however, necessarily implies the need for truncation to identify the
appropriate  model.  This  requirement  also  introduces  a  second  element  of 
uncertainty:  the  bias  introduced  by  truncation.  Although  a  number  of  order 
selection  criteria  have  been  proposed  to  solve  the  model  selection  problem 
by  establishing  an  optimal  trade-off  between  estimate  variability  and  bias, 
these  criteria  are  ineffective  for  weighting  sequence  truncation. 
Furthermore,  they  fail  to  account  for  the  frequency  response  characteristics 
of  the  system;  characteristics  which  are  particularly  important  for  the 
application  proposed  here.  For  these  reasons,  the  truncation  selection 
problem  is  examined  in  Chapters  4  and  5,  and  two  new  selection  criteria  are 
proposed.
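
The linearity property noted above (finite weighting sequence models map the parameter space linearly into the frequency domain) can be illustrated with a short numerical sketch. It assumes the usual truncated form G(e^{jwT}) = sum_k g(k) e^{-jwkT}; the function name, the tap values and the covariance matrix P are hypothetical and are not taken from the thesis.

```python
import numpy as np

def fir_frequency_row(n_taps, omega, T=1.0):
    """Return f such that G(e^{j*omega*T}) = f @ g for an n_taps weighting sequence g."""
    k = np.arange(n_taps)
    return np.exp(-1j * omega * T * k)

# Hypothetical truncated weighting sequence estimate and its covariance (assumed values).
g_hat = np.array([0.5, 0.3, 0.15, 0.05])
P = 0.01 * np.eye(len(g_hat))

omega = 0.7
f = fir_frequency_row(len(g_hat), omega)
G_hat = f @ g_hat                      # frequency response estimate at omega

# Real 2 x n map from the parameters to [Re G, Im G]. Because the map is linear,
# the parameter covariance P transforms exactly into the frequency domain.
F = np.vstack([f.real, f.imag])
cov_G = F @ P @ F.T                    # 2 x 2 covariance of [Re G, Im G]
```

Because no linearization is involved, a confidence region in the parameter space maps to a confidence region for the frequency response without approximation error, which is the property the paragraph above relies on.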

The  first  criterion  is  presented  in  Chapter  4.  Using  geometric 
interpretations  of  the  standard  "parameter-space"  problem,  the  truncation 
selection  problem  is  reformulated  in  terms  of  the  system  frequency  response 
characteristics at a preselected frequency [CL02]. An easy-to-implement
criterion  is  derived  to  handle  bias  implicitly  by  selecting  the  truncation 
which  establishes  an  optimal  trade-off  between  estimate-variability  and  bias 
in  the  frequency  domain.  This  criterion  establishes  the  correct  dependence 
of  truncation  level  on  frequency  and,  because  an  optimal  trade-off  between 
bias  and  variability  has  been  achieved,  the  frequency  response  confidence 
bounds  (derived  in  Chapter  3)  will  be  accurate  at  the  specified  frequency. 
However,  the  identified  truncation  is  frequency-specific.  As  such,  accurate 
uncertainty  information  will  only  be  available  over  a  limited  range  of 
frequencies.  So,  the  ability  to  quantify  the  interfrequency  dependence  of 
this information will be severely restricted.

To  retain  information  on  the  interdependence  of  the  uncertainty 
descriptions at individual frequencies [CL03], an alternative criterion is
derived  in  Chapter  5.  First,  a  statistical  definition  of  the  optimal  model 
order  (based  on  maximum  likelihood  arguments)  is  presented.  Theoretically, 
this optimal model order can be identified using Akaike's Information
Theoretic Criterion [AKA1]. However, practical implementations of this
criterion  are  found  to  require  "correction  factors"  to  ensure  that  the 
proper  model  is  selected.  For  weighting  sequence  truncation,  appropriate 
corrections  are  identified  and  used  to  establish  an  alternative, 
implementable  criterion.  This  criterion  not  only  identifies  the  optimal 
truncation,  it  also  quantifies  the  bias  associated  with  this  truncation. 
Thus,  by  establishing  appropriate  frequency-domain  extensions,  this 
criterion  is  shown  to  produce  explicit  bounds  on  the  frequency  response  bias 
introduced  by  truncation  at  each  and  every  frequency. 

In  Chapter  6,  the  results  of  Chapters  3  and  5  are  consolidated  and  used 
to  produce  a  complete  description  of  frequency  response  uncertainty 
(including  the  effects  of  both  estimate-variability  and  bias).  Techniques 
are  also  developed  to  optimize  this  uncertainty  description  based  on  the 
frequency  response  characteristics  of  the  system.  Furthermore,  it  is  shown 
that  the  procedures  derived  for  SISO  systems  can  be  readily  extended  to 
multivariable  systems  to  establish  an  accurate  element-by-element 
description of frequency response uncertainty. Finally, this element-by-element
uncertainty description is combined with the structured uncertainty
analysis  techniques  summarized  in  Chapter  2  to  generate  bounds  for  the 
characteristic  loci  of  the  perturbed  system,  and  the  implications  of  the 
resulting  uncertainty  description  are  discussed. 

An  investigation  of  the  multivariable  control  design  problem  begins  in 
Chapter 7 with the derivation of a system description which transforms the
fundamental concepts of the frequency-domain, characteristic-locus design
philosophy into a time-domain format and which permits the development of
on-line control algorithms in a true generalized-Nyquist sense [CL04].
Again,  weighting  sequence  models  are  found  to  be  particularly  useful  in 
establishing  the  required  link  between  the  frequency  domain  and  the  time 
domain. More specifically, it is shown that the 'characteristic sequences'
method [KOU2] generates a time-domain (weighting sequence) description for
the eigenstructure of a multivariable system. This sequence representation
is  used  to  establish  an  alternative  z-domain  "characteristic  subsystem" 
description  for  the  system,  and  the  properties  of  this  subsystem  description 
are investigated. Although it is found that, in some instances, this z-domain
description may not be suitable for control design purposes,
conditions  which  guarantee  the  utility  of  the  representation  are  identified 
and  techniques  to  ensure  the  availability  of  suitable  representations  are 
proposed.  The  utility  of  the  computer  implementations  based  on  this 
"characteristic  subsystem"  methodology  is  then  demonstrated  using  a 
conventional  characteristic-locus/commutative-controller  design  technique, 
and  it  is  shown  that  these  new  implementations  produce  much  more  accurate 
results  than  those  currently  in  use. 

In  Chapter  8,  a  multivariable  self-tuning  algorithm  is  proposed,  which 
incorporates  existing  SISO  self-tuning  algorithms  into  the  now-available 
"characteristic  subsystem"  framework  [CL05].  The  SISO  Generalized 
Predictive  Control  algorithm  proposed  by  Clarke  et  al  [CLA3]  is  found  to  be 
particularly  useful  for  this  purpose  because  it  explicitly  uses  a  finite 
number  of  weighting  sequence  elements  to  derive  the  desired  control  law.  As 
such,  it  can  be  readily  generalized  to  the  multivariable  problem  using  the 
sequence  representations  developed  previously.  Several  implementation 
considerations  (unique  to  the  multivariable  problem)  are  also  highlighted, 
and  the  resulting  algorithm  is  shown  (via  simulation)  to  produce  effective 
control. Finally, the problem of on-line identification is considered, and
two  algorithms  are  proposed  for  the  identification  of  the  "characteristic 
subsystem"  descriptions  directly  from  input/output  data.  It  is  suggested 
that,  using  these  algorithms,  both  the  output  prediction  and  control 
calculation  tasks  can  be  performed  entirely  as  scalar  operations,  thereby 
reducing  the  computational  complexity  of  the  overall  self-tuning  algorithm. 

The thesis concludes in Chapter 9 with a brief summary and proposals
for further research.


1.3.3  Notation 


The following notational conventions will be used throughout this thesis
unless explicitly redefined elsewhere:

SISO : Single-Input/Single-Output
MIMO : Multi-Input/Multi-Output
min : minimum
sup : supremum
max : maximum
inf : infimum
$R$ : the field of real numbers
$R^n$ : the n-dimensional vector space defined over $R$
$R^{+}$ : the set of non-negative numbers
$C$ : the field of complex numbers
$C^{n \times m}$ : the set of matrices with n rows and m columns with elements in $C$
$|x|$ : absolute value of the scalar x
$\arg(x)$ : argument of the complex scalar x
$x^t$ : the transpose of the vector x
$e_i$ : the $i^{th}$ standard basis vector (i.e. the $i^{th}$ column of the identity matrix)
XY : the vector defined from point X to point Y
$\|x\|$ : the two-norm of the vector x, defined by $\sqrt{x^t x}$
$A^T$ : the transpose of the matrix A
$\lambda(A)$ : an eigenvalue of the matrix A
$\rho(A)$ : the spectral radius of the matrix A, defined by $\max_i |\lambda_i(A)|$
$\sigma(A)$ : a singular value of the matrix A
$\bar{\sigma}(A)$ : the maximum singular value of the matrix A
$\underline{\sigma}(A)$ : the minimum singular value of the matrix A
tr[A] : the trace of the matrix A
det[A], |A| : the determinant of the matrix A
$A^{+}$ : a matrix with all elements replaced by their absolute values
T : the sample interval for a discrete-time system
y(kT) = y(k) : the time-domain quantity y, defined at sample time kT
$S_g$ : a sequence of quantities defined at discrete points in time (e.g. $S_g = [g(0), g(1), \ldots]$)
$\alpha$ : a specified level of confidence (between 0 and 1)
P{S} : the probability associated with the occurrence of the event S
E{x} : the expected value of the random variable x
$\sigma_x^2$ : the variance of the scalar random variable x

Additional notation will be defined as required.




CHAPTER  TWO 


FREQUENCY  DOMAIN  ANALYSIS  OF  MULTIVARIABLE  SYSTEMS 

As  highlighted  in  the  introduction,  the  implementation  of  frequency 
response  methods  for  the  analysis  and  design  of  multivariable  feedback 
control  systems  has  staged  a  dramatic  comeback  over  the  past  two  decades. 
This  chapter  summarizes  relevant  aspects  of  the  most  important  developments 
associated  with  the  frequency-domain  analysis  of  both  "certain" 
(unperturbed)  and  "uncertain"  (perturbed)  multivariable  systems. 

2.1  Analysis  of  Unperturbed  Systems:  The  Characteristic  Locus  Method 

The  generalized-Nyquist/characteristic-locus  method  proposed  by 
MacFarlane and Belletrutti [MAC1], MacFarlane and Postlethwaite [MAC2] and
MacFarlane  and  Kouvaritakis  [MAC3]  is  a  particularly  useful  tool  for  the 
analysis  and  design  of  unperturbed  systems.  Among  the  most  important 
reasons  for  this  are  the  following: 

(i)  It  establishes  necessary  and  sufficient  conditions  for  the  stability 
of  the  closed-loop  system. 

(ii)  It  reduces  the  multivariable  design  problem  to  a  set  of  SISO  problems 
on  which  classical  frequency  response  analysis  and  design  concepts 
can  be  applied. 

The  foundation  for  this  frequency-domain  method  is  the  theory  of  algebraic 
functions [BLI1] which makes it possible to relate a scalar algebraic
function, $g(s)$, to an $m \times m$ matrix-valued function of a complex variable,
$G(s)$, whose elements are rational functions in $s$. More specifically, this
theory can be used to define a set of characteristic gain functions, $g_i(s)$,
which satisfy the characteristic equation:

$$\Delta(s,g) = \det\{g(s)I - G(s)\} = \prod_{j=1}^{k} \Delta_j(s,g) = 0 \tag{2.1}$$

where each $\Delta_j(s,g)$ is a polynomial in $g(s)$ of order $z_j$ (with coefficients
$\{\alpha_{ij}(s);\ i = 0, \ldots, z_j\}$) which is irreducible over the field of rational
functions and each one of the functions $g_i(s)$ is associated with a single
$\Delta_j(s,g)$. Although eqn 2.1 defines the most general expression for the
characteristic equation, for most practical systems the polynomial
$\Delta(s,g) = \det\{g(s)I - G(s)\}$ is itself irreducible. So for simplicity in the
following presentation, it will be assumed that there is a single
characteristic gain function, $g(s)$, defined by the solution of:

$$\Delta(s,g) = \det\{g(s)I - G(s)\} = a_m(s)\, g^m + \cdots + a_0(s) = 0 \tag{2.2}$$

Since $\Delta(s,g)$ is irreducible, the characteristic gain will be multi-valued
over the field of complex numbers, C. Hence, closed curves in the
complex s-plane may not map to closed curves in the g(s)-plane. To overcome
this difficulty, g(s) must be defined on an m-sheeted Riemann surface
consisting of m copies of the complex plane appropriately joined together so
that g(s) is single-valued in this domain [MAC2]. The generalization of the
SISO  frequency  response  plot  follows  immediately  by  mapping  m  copies  of  the 
Nyquist  D-contour  (one  on  each  of  the  m  copies  of  the  complex  plane 
described  above)  under  the  characteristic  gain  function,  g(s). 

A  practical  alternative  to  the  Riemann  surface  development  summarized 
above  is  provided  by  the  fact  that,  almost  everywhere  on  the  complex  plane, 
the m roots of eqn 2.2 (i.e. the eigenvalues of G(s)) form a set of
locally-distinct analytic branches $\{g_i(s);\ i = 1, \ldots, m\}$ which will be referred to as
the eigenfunctions of G(s). The generalized frequency response plot can,
therefore, be generated as the composite plot of the set of loci traced out
by the eigenvalues of G(s) as s traverses a single copy of the Nyquist
contour. The loci generated in this manner are called the characteristic
loci of the system and, for practical systems, the characteristic loci
(possibly after appropriate pairing of positive and negative frequencies)
will form closed curves so that it is convenient to think of them as m
distinct plots. Furthermore, to each eigenfunction, $g_i(s)$, of G(s), there
correspond two vector-valued functions of s, $w_i(s)$ and $v_i(s)$, which satisfy
the standard eigenvalue/eigenvector relationships for G(s); namely,

$$G(s)\, w_i(s) = g_i(s)\, w_i(s), \qquad v_i^t(s)\, G(s) = g_i(s)\, v_i^t(s)$$

$$v_i^t(s)\, w_i(s) = 1,\ \forall i; \qquad v_i^t(s)\, w_k(s) = 0,\ \forall i,k;\ i \neq k$$

These  vector-valued  functions  will  be  referred  to  as  the  characteristic 
directions  (or  eigenvectors)  and  the  dual  characteristic  directions  (or  dual 
eigenvectors)  of  G(s),  respectively. 

Using  the  definitions  above,  G(s)  may  be  rewritten  in  terms  of  the 
eigenfunctions,  eigenvectors,  and  dual  eigenvectors  as: 

$$G(s) = W(s)\, \Lambda(s)\, V(s) = \sum_{i=1}^{m} g_i(s)\, w_i(s)\, v_i^t(s) \tag{2.3}$$

In addition, the closed-loop transfer function $R(s) = [I + G(s)]^{-1} G(s)$
can be related to the open-loop eigenfunctions and characteristic directions
in the following way:

$$R(s) = W(s)\, \{[I + \Lambda(s)]^{-1} \Lambda(s)\}\, V(s) = \sum_{i=1}^{m} \frac{g_i(s)}{1 + g_i(s)}\, w_i(s)\, v_i^t(s) \tag{2.4}$$

Eqn  2.4  clearly  highlights  two  particularly  noteworthy  characteristics. 
First,  the  eigenvector  structures  of  the  open-loop  and  closed-loop  systems 
are  identical.  Second,  the  open-loop  and  closed-loop  eigenfunctions 
maintain  the  classical  open-  to  closed-loop  relationship  typically 
associated  with  SISO  systems.  These  properties  can,  in  fact,  be  used  to 
relate  the  frequency  response  characteristics  of  the  open-loop 
eigenfunctions  and  characteristic  directions  to  the  stability  and 
performance  characteristics  of  the  closed-loop  system  as  described  below. 
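
The dyadic expansions of eqns 2.3 and 2.4 can be checked numerically at a single frequency. The sketch below is illustrative only: the 2x2 plant is hypothetical, and it assumes that W(s) holds the characteristic directions as columns and that V(s) is its inverse, so that the rows of V(s) are the dual characteristic directions with the normalisation given above.

```python
import numpy as np

def G(s):
    """Hypothetical stable 2x2 plant (illustrative values only)."""
    return np.array([[1.0 / (s + 1), 0.5 / (s + 2)],
                     [0.2 / (s + 1), 2.0 / (s + 3)]])

Gs = G(1j * 0.8)                       # frequency response at w = 0.8 rad/s
g, W = np.linalg.eig(Gs)               # eigenfunctions g_i and directions w_i (columns of W)
V = np.linalg.inv(W)                   # rows of V are the dual directions v_i^t

# Eqn 2.3: G = sum_i g_i w_i v_i^t
G_dyadic = sum(g[i] * np.outer(W[:, i], V[i, :]) for i in range(2))

# Eqn 2.4: R = [I + G]^{-1} G = sum_i g_i / (1 + g_i) w_i v_i^t
R = np.linalg.solve(np.eye(2) + Gs, Gs)
R_dyadic = sum(g[i] / (1 + g[i]) * np.outer(W[:, i], V[i, :]) for i in range(2))
```

Both reconstructions agree with the directly computed matrices to rounding error, which is the sense in which the open-loop and closed-loop systems share one eigenvector structure.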

2.1.1  Stability  Assessment 

For  SISO  systems,  the  return  difference  operator,  f(s)  =  1  +  g(s),  is 
related  to  the  closed-loop  poles  of  the  system  by: 

$$f(s) = 1 + g(s) = \frac{p_c(s)}{p_o(s)} \tag{2.5}$$

where $p_c(s)$ and $p_o(s)$ are the closed-loop and open-loop pole polynomials
of the system respectively. This relationship, when used in conjunction
with the principle of the argument, leads to the well-known Nyquist
stability criterion for SISO systems.

A  relationship  similar  to  eqn  2.5  can  also  be  established  for 
multivariable  systems.  For  proper  systems,  this  relationship  is: 

$$\det\{F(s)\} = \det\{I + G(s)\} = \gamma\, \frac{p_c(s)}{p_o(s)} \tag{2.6}$$

where F(s) is the multivariable counterpart of f(s) and $\gamma$ is a constant.
Indeed, eqn 2.6 may be used with the principle of the argument to establish
the closed-loop stability of multivariable systems. Unfortunately, no
convenient relationships between det{F(s)} and det{G(s)} exist, so this
result cannot be used to generate the same simple relationship between
open-loop frequency response and closed-loop stability that is available for SISO
systems. However, eqn 2.6 may be reduced to a more recognizable form using
the eigenfunctions of F(s) and G(s). In particular, noting that the
eigenfunctions, $f_i(s)$, of F(s) can be defined in the same manner as the
eigenfunctions of G(s), a standard result from matrix algebra can be used to
rewrite eqn 2.6 as:

$$\det\{F(s)\} = \prod_{i=1}^{m} f_i(s) = \prod_{i=1}^{m} [1 + g_i(s)] = \gamma\, \frac{p_c(s)}{p_o(s)} \tag{2.7}$$

Eqn  2.7  is  now  in  a  form  which  enables  the  application  of  a  generalized 
principle  of  the  argument  to  establish  the  generalized  Nyquist  criterion.  A 
rigorous  derivation  of  this  criterion  is  given  in  [MAC2].  It  suffices  here, 
however,  to  note  that  the  characteristic  loci  of  F(s)  form  a  set  of  closed 
curves  on  the  complex  plane  which  can  be  used  to  investigate  closed-loop 
stability.  In  addition,  the  characteristic  loci  of  G(s)  may  be  related  to 
those  of  F(s)  (for  purposes  of  this  stability  investigation)  by  shifting 
attention  from  the  origin  to  the  critical  point,  [-1,0],  in  the  complex 
plane.  Hence,  the  characteristic  loci  of  G(s)  emerge  as  the  appropriate 
medium for defining the generalized Nyquist criterion which may be stated as
follows:

The multivariable system defined by G(s) (with no unstable modes which are
uncontrollable and/or unobservable) will be stable under unity feedback
if, and only if, the net sum of counterclockwise encirclements of the
critical point [-1,0] by the characteristic loci of G(s) is equal to the
number of open-loop unstable poles of G(s).
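
The encirclement count in the criterion can be sketched numerically. Since det{I + G(s)} is the product of the factors 1 + g_i(s), the net encirclements of [-1,0] by the characteristic loci equal the net encirclements of the origin by det{I + G(jw)}, and the sketch below counts the latter. It is a minimal illustration, not the thesis's algorithm. It assumes a strictly proper plant with no poles on the imaginary axis, so the arc at infinity contributes nothing. The plant, function names and grid size are hypothetical.

```python
import numpy as np

def ccw_encirclements(G, n_points=20001, eps=1e-4):
    """Net counterclockwise encirclements of the origin by det{I + G(jw)}.

    Assumes G is strictly proper with no imaginary-axis poles, so that the
    frequency sweep from -1/eps to +1/eps captures the whole Nyquist contour.
    """
    theta = np.linspace(-np.pi / 2 + eps, np.pi / 2 - eps, n_points)
    omega = np.tan(theta)              # dense near w = 0, reaching |w| of about 1/eps
    d = []
    for w in omega:
        Gjw = G(1j * w)
        d.append(np.linalg.det(np.eye(Gjw.shape[0]) + Gjw))
    phase = np.unwrap(np.angle(np.array(d)))
    return int(round((phase[-1] - phase[0]) / (2 * np.pi)))

def plant(k):
    """Hypothetical open-loop stable 2x2 plant G(s) = k M / (s + 1)^3."""
    M = np.array([[1.0, 0.4], [0.3, 1.0]])
    return lambda s: k * M / (s + 1) ** 3

n_unstable_open_loop = 0               # this plant has no open-loop unstable poles
stable_low = ccw_encirclements(plant(4.0)) == n_unstable_open_loop
stable_high = ccw_encirclements(plant(10.0)) == n_unstable_open_loop
```

For this example the characteristic loci are scaled copies of 1/(s + 1)^3, each of which crosses the negative real axis at one eighth of its gain. At k = 10 the larger locus passes to the left of [-1,0] and the count becomes negative, so the criterion reports instability.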

2.1.2  Performance  Evaluation 

The  relationships  identified  above  can  be  used  not  only  to  establish 
the  multivariable  generalization  of  the  Nyquist  stability  criterion,  but 
also (because of the analytic nature of the eigenfunctions, $g_i(s)$) to assess
the closed-loop performance of the system. In particular, the same one-to-one
relationship between the proximity of $g(j\omega)$ to the [-1,0] point and the
location  of  the  closed-loop  poles  that  holds  for  SISO  systems  also  applies 
directly  to  the  characteristic  loci  of  multivariable  systems.  As  a  result, 
the  classical  frequency-domain  concepts  of  bandwidth  and  gain/phase  margins 
can  be  applied  to  the  open-loop  characteristic  loci  to  assess  the  transient 
response  characteristics  (i.e.  rise  time,  settling  time,  percent  overshoot, 
etc.)  of  the  closed-loop  system.  Because  of  the  additional  multidimensional 
aspects  of  the  problem  however,  there  are  other  considerations  which  must 
also  be  addressed. 

As  shown  in  [MAC3],  these  additional  topics  can  be  highlighted  by 
considering  the  response  of  the  closed-loop  system  to  a  reference  input 
vector sinusoid represented by means of the input phasor vector $r(j\omega)$. From
eqn 2.4, closed-loop system output for this specific input is given by:

$$y(j\omega) = \sum_{i=1}^{m} \frac{g_i(j\omega)}{1 + g_i(j\omega)}\, \{v_i^t(j\omega)\, r(j\omega)\}\, w_i(j\omega) \tag{2.8}$$

Eqn 2.8 clearly demonstrates that system output can be expressed as the sum
of m components, each of which lies along a characteristic direction,
$w_i(j\omega)$, and is modulated by the corresponding frequency-dependent
eigenfunction of the closed-loop system, $g_i(j\omega)/\{1 + g_i(j\omega)\}$. As such,
closed-loop accuracy and multivariable interaction may be analyzed in terms
of the eigenfunctions and characteristic directions of the system. In
particular, when the moduli of all the frequency-dependent eigenfunctions,
$g_i(j\omega)$, are sufficiently large, $g_i(j\omega)/\{1 + g_i(j\omega)\} \approx 1$ and eqn 2.8 reduces
to:

$$y(j\omega) \approx \sum_{i=1}^{m} \{w_i(j\omega)\, v_i^t(j\omega)\}\, r(j\omega) = r(j\omega)$$

So, large moduli lead to good accuracy. This relationship also suggests
that the response of the $j^{th}$ output to the $i^{th}$ input (for $j \neq i$) is
negligible. Hence, large moduli will also act to suppress interaction.
Alternatively,  when  large  gains  cannot  be  achieved  (due  to  stability 
considerations  or  power  limitations,  for  example),  low  interaction  can  be 
achieved  by  aligning  the  characteristic  directions  to  the  standard  basis 
vectors. When this is accomplished, an input in the direction of the $i^{th}$
standard basis vector, $e_i$, produces an output which is itself in the
direction of $e_i$ (as demonstrated by eqn 2.8). Hence, interaction is again
suppressed and an appropriate level of accuracy can be achieved by
manipulation of the appropriate $g_i(j\omega)$.
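
The high-gain argument can be illustrated with a small numerical experiment. The sketch below scales a hypothetical 2x2 plant by a scalar gain k (an assumption made purely for illustration) and reports how far the closed-loop frequency response departs from the identity at one frequency.

```python
import numpy as np

def G(s):
    """Hypothetical stable 2x2 plant (illustrative values only)."""
    return np.array([[1.0 / (s + 1), 0.5 / (s + 2)],
                     [0.2 / (s + 1), 2.0 / (s + 3)]])

def closed_loop(k, w):
    """R(jw) = [I + kG(jw)]^{-1} kG(jw) for a scalar loop gain k."""
    kG = k * G(1j * w)
    return np.linalg.solve(np.eye(2) + kG, kG)

w = 0.8
for k in (1.0, 10.0, 1000.0):
    R = closed_loop(k, w)
    tracking_error = np.linalg.norm(R - np.eye(2), 2)    # departure from y = r
    interaction = abs(R[0, 1]) + abs(R[1, 0])             # off-diagonal coupling
    print(f"k = {k:7.1f}  ||R - I|| = {tracking_error:.4f}  interaction = {interaction:.4f}")
```

As k grows, every eigenfunction of kG grows in modulus, and both the tracking error and the off-diagonal terms shrink, in line with the reduction of eqn 2.8 to y = r.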

The  observations  above  can  be  used  to  identify  the  following  set  of 
conditions  which  must  be  satisfied  to  ensure  satisfactory  closed-loop 
performance  in  terms  of  stability  and  low  interaction: 

(i) The set of characteristic loci, $\{g_j(j\omega);\ j = 1, \ldots, m\}$, must satisfy the
generalized  Nyquist  stability  criterion. 

(ii)  The  characteristic  loci  must  all  have  sufficiently  high  gain  over  a 
desired  operating  bandwidth  to  provide  the  required  accuracy  for  the 
tracking  of  a  reference  input  vector  by  the  output  vector. 

(iii)  The  high  frequency  alignment  of  the  characteristic  directions  of 
G(s)  to  the  standard  basis  direction  set  should  be  sufficiently  good 
to  keep  high  frequency  interaction  within  acceptable  limits. 

Once these conditions have been satisfied, the eigenvalue, $g_j(j\omega)$, associated
with the eigenvector, $w_j(j\omega)$, which aligns best with the $j^{th}$ standard basis
vector, $e_j$, can be regarded as a good practical approximation to the
transference between the $j^{th}$ input and $j^{th}$ output at the specified
frequency. Thus, an assessment of loop-by-loop performance (i.e. the
closed-loop response of the $j^{th}$ output to the $j^{th}$ input) can be based on an
inspection  of  the  appropriate  characteristic  locus  and  classical  frequency 
response  concepts  may  be  applied  to  ascertain  the  dynamic  performance 
characteristics  of  each  individual  loop.  Finally,  control  designs  based  on 
the  manipulation  of  the  characteristic  loci  can  be  used  to  achieve  the 
desired  response. 

2.2  Analysis  of  Perturbed  Systems 

For  practical  situations,  the  errors  inherent  in  describing  any 
physical  system  by  a  finite-dimensional  linear  model  introduce  uncertainties 
which  cannot  be  handled  explicitly  in  the  design  process.  Yet  adequate 
control  designs  must  incorporate  features  that  allow  the  closed-loop  system 
to  function  properly  despite  these  uncertainties.  Although  common 
multivariable  design  algorithms  (including  the  characteristic  locus  method) 
work  exceptionally  well  in  the  absence  of  uncertainty,  examples  have  been 
generated  to  demonstrate  the  sensitivity  of  these  methods  to  changes  in  the 
nominal  plant.  Due  to  this  sensitivity,  designs  based  on  a  specific  nominal 
plant model may not produce reliable closed-loop behaviour in the presence
of uncertainty.

To investigate the adequacy of any control system design, it is
therefore necessary to assess not only the nominal behaviour of the plant
but also the robustness characteristics associated with this behaviour.
As mentioned previously, interest in this aspect of feedback control has
prompted a great deal of research over the past several years (e.g. [ILM1],
[ISR1]), and many interesting and useful results have been produced. A
necessary  first  step  in  developing  analysis  techniques  to  assess  system 
robustness is the development of a mathematical formulation of the
perturbation problem. Two widely-used uncertainty descriptions for this
purpose are given by additive and multiplicative perturbations to a nominal
transfer function matrix, G, as shown here:

$$G_p(s) = G(s) + \Delta(s), \qquad G_p(s) = [I + D(s)]\, G(s) \tag{2.9a,b}$$

Since  the  developments  for  both  representations  are  similar,  only  additive 
perturbations  will  be  considered  further  here.  Simple  extensions  of  the 
following  results  to  the  case  of  multiplicative  perturbations  can  be 
obtained  by  making  appropriate  substitutions  in  the  development. 

Given  the  additive  perturbation  representation  above,  the  use  of 
frequency  response  characteristics  to  assess  the  robust  stability  properties 
of  multivariable  systems  has  proven  to  be  particularly  effective.  Under  the 
assumption that the number of unstable open-loop poles of $G_p(s)$ and $G(s)$ are
identical, an extension of the generalized Nyquist criterion to the analysis
of perturbed systems clearly suggests that the closed-loop system will
remain stable if, and only if, the nominal characteristic loci of the system
satisfy the generalized Nyquist criterion and none of the nominal
characteristic loci can be deformed so as to pass through the [-1,0] point
as $G$ is warped continuously toward $G_p$. This condition, in turn, is
satisfied if, and only if,

$$|I + G + \Delta| \neq 0; \quad \forall \Delta \tag{2.10}$$

But condition 2.10 can be rewritten as:

$$|I + G + \Delta| = |I + G|\; |I + (I + G)^{-1} \Delta| \neq 0$$

and, since $|I + G| \neq 0$, this result implies the following spectral radius
condition:

$$\max_i |\lambda_i[(I + G)^{-1} \Delta]| = \rho[(I + G)^{-1} \Delta] < 1; \quad \forall \Delta \tag{2.11}$$


Conditions  2.10  and  2.11  provide  necessary  and  sufficient  conditions  for  the 
guaranteed  stability  of  the  perturbed  closed-loop  system  and  form  the  basis 
for  many  of  the  robust  stability  criteria  that  have  been  proposed  to  date. 
Effective criteria, however, require more than the simple uncertainty
formulations defined by eqns 2.9a,b. An accurate, more-detailed
mathematical description of the class of allowable perturbations is also
required;  a  description  that  must  be  sufficiently  simple  to  yield  useful 
analytical  results,  yet  complex  enough  to  restrict  the  class  to  only  those 
perturbations  that  can  actually  occur.  Several  such  descriptions  are 
presented  below,  and  the  robustness  criteria  arising  from  these  descriptions 
are  summarized. 


2.2.1  The  Case  of  Unstructured  Perturbations 

A  large  number  of  studies  have  focused  on  the  relatively  simple  case 
where the only available information on $\Delta(s)$ is an upper bound on its
maximum  singular  value  at  each  frequency.  This  class  of  "unstructured" 
perturbations  can  be  described  by: 


$$D_U = \{\Delta(s) :\ \bar{\sigma}[\Delta(s)] \leq \delta(s) \in R^{+};\ \forall s = j\omega\} \tag{2.12}$$

Using a singular value analysis, Doyle and Stein [DOY1] have shown that,
when the nominal system is closed-loop stable and the number of unstable
poles of $G$ and $G_p$ are identical, the perturbed system will remain stable for
all allowable perturbations if, and only if,

$$\underline{\sigma}[I + G(s)] > \bar{\sigma}[\Delta(s)] \quad \forall s = j\omega \tag{2.13}$$

This result follows directly from inequality 2.11 since

$$\rho[(I+G)^{-1} \Delta] \leq \bar{\sigma}[(I+G)^{-1} \Delta] \leq \bar{\sigma}[(I+G)^{-1}]\, \bar{\sigma}[\Delta] \tag{2.14}$$

and it can be shown that there exists at least one $\Delta \in D_U$ such that the
equalities in eqn 2.14 hold. Hence, a necessary and sufficient condition
for perturbed system stability is:

$$\bar{\sigma}[(I+G)^{-1}]\, \bar{\sigma}[\Delta] < 1$$

and condition 2.13 follows immediately, since $\bar{\sigma}[(I+G)^{-1}] = 1/\underline{\sigma}[I+G]$.
Other singular value criteria (based on an inverse Nyquist test) have been
proposed for the situation where the number of unstable open-loop poles of
$G$ and $G_p$ are not the same and when information on the size of the
perturbation associated with the inverse plant is available [POS1], but
these criteria will not be discussed in detail here.

Singular value criteria such as the one identified by condition 2.13
provide an immediate assessment of the magnitude of uncertainty that can be
tolerated  while  maintaining  closed-loop  stability.  However,  these  criteria 
focus  on  system  gain  information  while  failing  to  account  directly  for  any 
available  phase  information.  Yet,  as  suggested  by  the  characteristic  locus 
method,  the  eigenvalues  of  the  perturbed  system  may  also  be  used  to  assess 
the  robust  performance  of  the  system  provided  appropriate  phase  information 
can be generated. Towards this goal, several results ([DAN1]-[DAN3], [HAR1])
have  been  produced  to  establish  characteristic  locus  inclusion  bands  based 
on  the  assumption  of  unstructured  uncertainties. 
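
A gain-only test of the kind given by condition 2.13 is simple to evaluate on a frequency grid. The sketch below is illustrative only. It assumes, as the criterion does, that the nominal closed loop is stable and that the perturbation does not change the number of unstable open-loop poles; neither assumption is checked by the code. The plant, the bound function delta(w) and the grid are hypothetical.

```python
import numpy as np

def G(s):
    """Hypothetical stable 2x2 plant (illustrative values only)."""
    return np.array([[1.0 / (s + 1), 0.5 / (s + 2)],
                     [0.2 / (s + 1), 2.0 / (s + 3)]])

def robustly_stable(G, delta, omegas):
    """Check sigma_min[I + G(jw)] > delta(w) at every grid frequency (condition 2.13)."""
    for w in omegas:
        Gjw = G(1j * w)
        sigma_min = np.linalg.svd(np.eye(Gjw.shape[0]) + Gjw, compute_uv=False)[-1]
        if sigma_min <= delta(w):
            return False
    return True

omegas = np.logspace(-2, 3, 400)
small_uncertainty_ok = robustly_stable(G, lambda w: 0.5, omegas)
large_uncertainty_ok = robustly_stable(G, lambda w: 2.5, omegas)
```

A grid test of this kind can miss a violation between grid points, so in practice the grid would need to be refined around frequencies where the margin is small.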

A  useful  starting  point  for  the  development  of  characteristic  locus 
bounds  is  the  Bauer-Fike  eigenvalue  perturbation  result  [WIL1]  given  by: 

$$|\lambda(G_p) - \lambda(G)| \leq c\, \delta \tag{2.15}$$

where c is the condition number of the nominal eigenvector matrix W defined
by $\bar{\sigma}(W)/\underline{\sigma}(W)$. This result can be used to construct circular regions in the
complex  plane  which  contain  the  perturbed  system  characteristic  loci. 
However,  in  situations  where  the  eigenvectors  of  G  are  skew,  these  regions 
may  be  excessively  large  and  may  also  be  extremely  sensitive  to  small 
changes  in  the  nominal.  To  overcome  these  problems,  Daniel  and  Kouvaritakis 
([DAN1], [DAN2]) have suggested the use of normal approximations to the
nominal  plant  and  have  demonstrated  the  improvements  in  both  size  and 
sensitivity  that  can  be  achieved  using  this  approach.  Furthermore,  the 
normal  approximation  bounds  can  be  refined  still  further  by  combining  them 
with information on the numerical range of G [DAN2] and by implementing
similarity scaling techniques [HAR1] to produce eigenvalue inclusion regions
which are easy to compute and insensitive to changes in the nominal. The
resulting characteristic locus description for the perturbed system paves
the way, as highlighted in [DAN2], for a systematic design methodology based
on  the  manipulation  of  the  characteristic  locus  bands  in  a  robust  manner. 
Unfortunately,  the  design  procedure  may,  in  this  case,  generate  conservative 
results due to the fact that the inclusion bands are not tight (i.e.
boundary points of the given eigenvalue inclusion regions may not be
attainable for any $\Delta$ in the specified class).
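
The Bauer-Fike bound of eqn 2.15 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix `G`, the bound `delta` and the helper names are hypothetical and are chosen only to show how skew eigenvectors inflate the radius $c\,\delta$ relative to the eigenvalue shift actually observed.

```python
import numpy as np

def bauer_fike_radius(G, delta):
    """Radius c * delta of the Bauer-Fike discs (eqn 2.15)."""
    _, W = np.linalg.eig(G)                    # columns of W: nominal eigenvectors
    sv = np.linalg.svd(W, compute_uv=False)    # singular values in descending order
    c = sv[0] / sv[-1]                         # condition number of W
    return c * delta

def max_eigenvalue_shift(G, Delta):
    """Largest distance from an eigenvalue of G + Delta to the nearest eigenvalue of G."""
    lam = np.linalg.eigvals(G)
    lam_p = np.linalg.eigvals(G + Delta)
    return max(np.min(np.abs(lp - lam)) for lp in lam_p)

# Hypothetical nominal matrix with nearly parallel (skew) eigenvectors.
G = np.array([[1.0, 5.0], [0.0, 1.2]], dtype=complex)
delta = 0.05

rng = np.random.default_rng(0)
E = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
Delta = delta * E / np.linalg.norm(E, 2)       # scaled so that sigma_max(Delta) = delta

radius = bauer_fike_radius(G, delta)
shift = max_eigenvalue_shift(G, Delta)
print("Bauer-Fike radius:", radius, " observed shift:", shift)
```

For a matrix such as this one, the radius is typically much larger than the observed shift, which is the conservatism that the normal-approximation and E-contour methods are intended to reduce.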

In  search  of  non-conservative  results,  it  should  be  recognized  that  the 
eigenvalues of $G + \Delta$ are given by the solutions, $z$, of:

    $|G + \Delta - zI| = 0$  or equivalently  $\underline{\sigma}(G - zI + \Delta) = 0$

Using the singular value inequality [GAN1],

    $\underline{\sigma}(G + \Delta - zI) \ge \underline{\sigma}(G - zI) - \bar{\sigma}(\Delta)$,    ....(2.16)

the following theorem may be stated:

Theorem 2.1: [DAN3] Let $z \in C$ and $\Delta \in D_U$, then

(i) if $\underline{\sigma}(G - zI) > \delta$, $z$ is not an eigenvalue of $G + \Delta$ for any $\Delta \in D_U$;
(ii) if $\underline{\sigma}(G - zI) \le \delta$, there exists a $\Delta \in D_U$ such that $z$ is an eigenvalue
of $G + \Delta$.
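
Part (ii) of Theorem 2.1 is constructive: a rank-one perturbation built from the minor singular vectors of $G - zI$ places an eigenvalue at $z$. The sketch below illustrates this under stated assumptions (NumPy available; the matrix `G`, the point `z` and the function names are hypothetical). It is an illustration of the standard singular value argument, not a reproduction of the proof in [DAN3].

```python
import numpy as np

def sigma_min(A):
    """Smallest singular value of A."""
    return np.linalg.svd(A, compute_uv=False)[-1]

def smallest_perturbation_placing_eigenvalue(G, z):
    """Return (Delta, s_min) with sigma_max(Delta) = s_min and z an eigenvalue of G + Delta."""
    m = G.shape[0]
    U, s, Vh = np.linalg.svd(G - z * np.eye(m))
    # (G - zI) v_min = s_min u_min, so subtracting s_min u_min v_min^H
    # makes G + Delta - zI singular.
    Delta = -s[-1] * np.outer(U[:, -1], Vh[-1, :])
    return Delta, s[-1]

# Hypothetical nominal frequency response matrix and test point.
G = np.array([[2.0, 1.0], [0.5, 3.0]], dtype=complex)
z = 2.2 + 0.3j

Delta, s_min = smallest_perturbation_placing_eigenvalue(G, z)
eigs = np.linalg.eigvals(G + Delta)
print("sigma_min(G - zI):", s_min)
print("eigenvalues of G + Delta:", eigs)
```

Any perturbation with $\bar{\sigma}(\Delta)$ smaller than `s_min` leaves $G + \Delta - zI$ nonsingular by eqn 2.16, which is part (i) of the theorem.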

This result leads directly to the construction of exact characteristic locus
inclusion regions (referred to as E-contours). If $z = \lambda_i + \rho(\theta)\,e^{j\theta}$, where
$\lambda_i$ represents an eigenvalue of the nominal system at a given frequency,
$0 \le \theta < 2\pi$ and $\rho(\theta) \in R^+$, then the first solution, $z$, of $\underline{\sigma}(G - zI) = \delta$ as
$\rho(\theta)$ increases identifies a point on the boundary of the region at the
specified frequency. It has been shown [DAN3] that the E-contours formed in
this manner generate simply-connected closed curves, the union of which
contains the characteristic loci of the perturbed system. Once these bounds
are established, robust stability can be assessed using the generalized
Nyquist criterion and, since the characteristic loci of $G_p$ will lie on the
boundary for some $\Delta \in D_U$, the resulting stability criterion is necessary and
sufficient (providing an alternative to the Doyle/Stein criterion identified
by  condition  2.13).  Indeed,  the  equivalence  of  these  criteria  can  be 
established  by  setting  z  =  -1  in  Theorem  2.1.  The  key  additional  feature  of 
the  E-contour  result,  however,  is  the  addition  of  phase  information  to  the 
uncertainty  description;  an  addition  which  preserves  the  classical  Nyquist 
attributes of the frequency-domain description of the system and permits the
assessment  of  both  robust  stability  and  robust  performance. 
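
The radial search that defines an E-contour can be sketched as follows. This is a simple marching procedure under stated assumptions (NumPy available; fixed step `rho_step`, search limit `rho_max` and all names are hypothetical); it is not the algorithm of [DAN3], only a direct transcription of the "first solution as $\rho(\theta)$ increases" description at a single frequency.

```python
import numpy as np

def sigma_min(A):
    """Smallest singular value of A."""
    return np.linalg.svd(A, compute_uv=False)[-1]

def e_contour(G, lam_i, delta, n_theta=72, rho_step=1e-3, rho_max=10.0):
    """Approximate boundary points of the E-contour around the eigenvalue lam_i."""
    identity = np.eye(G.shape[0])
    points = []
    for theta in np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False):
        direction = np.exp(1j * theta)
        rho = 0.0
        # March outwards along the ray until sigma_min(G - zI) first reaches delta.
        while rho < rho_max and sigma_min(G - (lam_i + rho * direction) * identity) < delta:
            rho += rho_step
        points.append(lam_i + rho * direction)
    return np.array(points)

# Hypothetical normal (diagonal) matrix: the contour should be close to a circle
# of radius delta centred on the chosen eigenvalue.
G = np.diag([1.0, 3.0]).astype(complex)
contour = e_contour(G, lam_i=1.0, delta=0.1, n_theta=16)
print(np.abs(contour - 1.0))
```

A bisection or root-finding step on $\rho$ would be more efficient than fixed-step marching; the marching form is used here only because it makes the "first solution" requirement explicit.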

2.2.2  The  Case  of  Structured  Perturbations 

Although  the  assumption  of  unstructured  perturbations  produces  a 
particularly  convenient  mathematical  description  of  uncertainty  for  analysis 
purposes,  this  description  cannot  accommodate  information  on  individual 
elements or blocks of the perturbation. In most practical situations,
however, any non-arbitrary estimate for the upper bound on $\bar{\sigma}(\Delta)$ generally
implies  that  this  extra  information  is  available.  Under  these 
circumstances,  a  robustness  analysis  based  on  unstructured  uncertainty  may 
be  unduly  conservative  due  to  the  fact  that  this  additional  information  has 
been  ignored  and,  hence,  that  the  true  class  of  allowable  perturbations  has 
not  been  precisely  defined.  To  reduce  this  conservatism,  the  description  of 
the  class  of  perturbations  under  investigation  must  be  refined  and 
appropriate  analysis  tools  to  accommodate  these  refinements  must  be 
developed.  Several  particularly  useful  descriptions  of  uncertainty  for  this 
purpose  have  been  identified  and  the  classes  of  perturbations  associated 
with  these  descriptions  are  defined  by: 

(i) the class of "block-diagonal" perturbations:

    $D_D = \{\Delta \in C^{m \times m}: \Delta = \mathrm{diag}\{\Delta_{ii}\};\ \bar{\sigma}(\Delta_{ii}) \le \delta \in R^+\}$    ....(2.17a)

(i.e. $\Delta_{ii}$ is an unstructured block along the diagonal of $\Delta$);

(ii) the class of "block-structured" perturbations:

    $D_B = \{\Delta \in C^{m \times m}: \Delta = [[\Delta_{ij}]];\ \bar{\sigma}(\Delta_{ij}) \le \delta \in R^+\}$    ....(2.17b)

where $\Delta = [[\Delta_{ij}]]$ indicates that the matrix $\Delta$ is composed of several
arbitrarily-positioned, unstructured blocks; and

(iii) the class of element-by-element, bounded perturbations:

    $D_S = \{\Delta \in C^{m \times m}: |\Delta_{ij}| \le p_{ij} \in R^+;\ \arg(\Delta_{ij}) = \theta_{ij},\ 0 \le \theta_{ij} < 2\pi\}$
        $= \{\Delta \in C^{m \times m}: \Delta^+ \le P\}$    ....(2.17c)

To initiate investigations of the structured perturbation problem,
Doyle [DOY2] defined the function $\mu$ such that:

    $\mu(M) = 0$  if no $\Delta \in D_D$ solves $|I + M\Delta| = 0$;
    $\mu(M) = [\min \{\bar{\sigma}(\Delta): |I + M\Delta| = 0,\ \Delta \in D_D\}]^{-1}$  otherwise    ....(2.18)

where $M = (I + G)^{-1}$. Using this definition, a necessary and sufficient
condition for robust stability is:

    $\mu(M)\,\delta < 1$    ....(2.19)


In effect, condition 2.19 reiterates the spectral radius condition defined
by condition 2.11 except that now $\mu(M)$ (and any upper bounds generated for
it) must explicitly account for the structure of the assumed perturbations.

One means of accomplishing this task is the use of eigenvalue-preserving
scaling techniques. As shown in [DOY2] and [SAF1], similarity
scaling with a positive, diagonal scaling matrix $S$ may be used to attack the
problem of diagonal perturbations. In particular,

    $\sup_{\Delta \in D_D} \rho(M\Delta) = \sup_{\Delta \in D_D} \rho(S M \Delta S^{-1}) = \delta \sup_U \rho(S M S^{-1} U) = \delta \sup_U \rho(M U)$

where $U$ is a unitary matrix containing the phase information associated with
$\Delta \in D_D$. Under these circumstances, $\mu$ takes the form:

    $\mu = \sup_U \rho(M U) \le \inf_S [\bar{\sigma}(S M S^{-1})]$    ....(2.20)

Thus, an upper bound on $\mu$ (which incorporates structural information) can be
identified by selecting $S$ to minimize $\bar{\sigma}(S M S^{-1})$. This similarity-scaling
approach also applies to the more general block-structured problem since any
block-structured matrix can be rearranged into block-diagonal form using
appropriate eigenvalue-preserving transformations [DOY2]. Doyle has also
shown that equality in eqn 2.20 holds for up to three distinct blocks and,
hence, a stability criterion based on $\inf_S [\bar{\sigma}(S M S^{-1})]$ is necessary and
sufficient for at least this set of structured perturbation problems.
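
The two sides of eqn 2.20 can be compared numerically for a small example. The following sketch assumes NumPy, a hypothetical 2 x 2 matrix `M` and two scalar diagonal blocks, so that the scaling reduces to $S = \mathrm{diag}(1, d)$ and the unitary matrix to $U = \mathrm{diag}(1, e^{j\phi})$. Both optimizations are replaced by coarse grid searches, which is an illustrative simplification and not the procedure of [DOY2].

```python
import numpy as np

def sigma_max(A):
    """Largest singular value of A."""
    return np.linalg.svd(A, compute_uv=False)[0]

def spectral_radius(A):
    return np.max(np.abs(np.linalg.eigvals(A)))

def scaled_upper_bound(M, d_grid):
    """min over S = diag(1, d) of sigma_max(S M S^-1), evaluated on a grid of d > 0."""
    best = np.inf
    for d in d_grid:
        S = np.diag([1.0, d])
        S_inv = np.diag([1.0, 1.0 / d])
        best = min(best, sigma_max(S @ M @ S_inv))
    return best

def sampled_lower_bound(M, n_phase=360):
    """max over U = diag(1, e^{j phi}) of rho(M U), evaluated on a grid of phases."""
    phases = np.linspace(0.0, 2.0 * np.pi, n_phase, endpoint=False)
    return max(spectral_radius(M @ np.diag([1.0, np.exp(1j * p)])) for p in phases)

# Hypothetical M with badly balanced off-diagonal gains.
M = np.array([[0.5, 2.0], [0.1, 0.4]], dtype=complex)

upper = scaled_upper_bound(M, np.logspace(-2, 2, 401))
lower = sampled_lower_bound(M)
print("unscaled sigma_max:", sigma_max(M))
print("sup rho(MU) (sampled):", lower, " inf sigma_max(SMS^-1) (grid):", upper)
```

With only two blocks, the sampled lower bound and the scaled upper bound should nearly coincide, in keeping with the equality result quoted above, while the unscaled singular value is noticeably larger.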

The block-structured/similarity-scaling approach can, of course, be
applied directly to element-by-element, bounded perturbations as well. In
this case, however, the dimension of the optimization problem increases from
$m - 1$ to $m^2 - 1$, and the complexity of the optimization problem increases
commensurately. As a result, additional research has focused on generating
alternative criteria for this problem. Early efforts ([KAN1], [LUN1],
[OWE1]) used positive matrix results based on the spectral radius
relationship:

    $\rho(M\Delta) \le \rho(M^+ P)$

to generate stability criteria which can be easily computed. However,
because these criteria ignore available phase information on $M$, the
resulting analysis tends to be unnecessarily conservative. Kouvaritakis and
Latchman have shown that such conservatism can be reduced using scaling
techniques [KOU4]. Because of the element-by-element nature of the problem,
it is possible, as demonstrated below, to introduce all of the flexibility
required by using $2m - 1$ "non-similarity" scaling parameters instead of the
$m^2 - 1$ similarity scaling parameters discussed above.

Returning to condition 2.11, a necessary and sufficient condition for
robust stability is given by $\rho(M\Delta) < 1$. But,

    $\rho(M\Delta) = \rho(R^{-1} M \Delta R) = \rho(R^{-1} M L^{-1} L \Delta R) \le \bar{\sigma}(R^{-1} M L^{-1} L \Delta R)$
        $\le \bar{\sigma}(R^{-1} M L^{-1})\, \bar{\sigma}(L \Delta R)$    ....(2.21)

where $L$ and $R$ are positive, diagonal matrices. Hence, selecting $L$ and $R$ to
minimize the "cross-condition number" defined by:

    $k\{P, M^{-1}\} = \bar{\sigma}(L P R) / \underline{\sigma}(L M^{-1} R)$    ....(2.22)

will significantly reduce the calculated upper bound on $\rho(M\Delta)$ and the
corresponding stability criterion can be made less conservative. In fact,
condition 2.21 suggests that (after implementing non-similarity scaling) the
only remaining conservatism arises from either of the two inequalities:

    $\rho(R^{-1} M L^{-1} L \Delta R) \le \bar{\sigma}(R^{-1} M L^{-1} L \Delta R)$
    or
    $\bar{\sigma}(R^{-1} M L^{-1} L \Delta R) \le \bar{\sigma}(R^{-1} M L^{-1})\, \bar{\sigma}(L \Delta R)$

So, the conservatism in the resulting stability condition will be completely
eliminated if the equalities in both these relationships hold
simultaneously. As shown in [KOU5], this goal is attained when certain
relationships between $\tilde{M} = R^{-1} M L^{-1}$ and the "worst-case" $\tilde{\Delta} = L \Delta R$ exist. In
particular, if the major input and output principal directions of a matrix $A$
($y_A$ and $x_A$, respectively) are defined as the eigenvectors associated with
the maximum singular value of $A$ in the following manner:

    $A^* A\, y_A = \bar{\sigma}^2 y_A$        $A A^* x_A = \bar{\sigma}^2 x_A$

it is possible to establish the following result:

THE MAJOR PRINCIPAL DIRECTION ALIGNMENT (MPDA) PRINCIPLE: [KOU5]

$\rho(M\Delta) = \bar{\sigma}(M)\, \bar{\sigma}(\Delta)$ if, and only if, $x_M = e^{j\theta} y_\Delta$ and $y_M = x_\Delta$ (i.e.
the major principal directions of $M$ and $\Delta$ are 'aligned').

Based on this result, a singular value analysis of robust stability
will yield necessary and sufficient conditions if, and only if, MPDA holds
for $M$ and the "worst-case" $\Delta$. It is interesting to note here that, for the
case of unstructured uncertainty, MPDA is always satisfied without scaling
due to the phase flexibility in the prescribed perturbation class. However,
the structural properties associated with the class of perturbations defined
by $D_S$ necessarily constrain the principal directions of $\Delta$ and so scaling
must be used to precondition the problem with a view towards achieving MPDA.
In fact, it has been shown [KOU5] that, for independent element-by-element
perturbations, the scaling matrices ($L$ and $R$) which minimize the
cross-condition number defined by eqn 2.22 also equalize the moduli of the
elements of ($x_M$, $y_\Delta$) and ($x_\Delta$, $y_M$) simultaneously provided that either $M$ is
$2 \times 2$ or the maximum singular value of $R^{-1} M L^{-1}$ has a stationary point.
Furthermore, because the phases of the elements of $\Delta$ can be adjusted
independently, it is always possible to select a perturbation $\Delta \in D_S$ which
aligns the principal directions for the given scaling. When combined, these
results produce:

The Cross-Condition Number Criterion: [KOU5], [LAT1] For all $s$ on the
Nyquist D-contour,

(i) let $G(s)$ denote the nominal open-loop transfer function matrix of
a linear, time-invariant multivariable system which is stable
under unity feedback;

(ii) let $\Delta(s)$ be a stable transfer function matrix such that $\Delta(s)$
exists in $D_S$; and

(iii) assume that $G(s)$ is $2 \times 2$ or that the cross-condition number (eqn
2.22) has a stationary point.

Then the perturbed system, $G(s) + \Delta(s)$, is guaranteed to be stable if and
only if the optimal cross-condition number defined by:

    $k_o\{P, M^{-1}\} = \inf_{L(s), R(s)} \{\bar{\sigma}(L P R) / \underline{\sigma}(L M^{-1} R)\}$    ....(2.23)

is less than unity for all $s$.
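
The role of the non-similarity scaling in eqns 2.21 to 2.23 can be illustrated at a single frequency. The sketch below assumes NumPy, a hypothetical 2 x 2 matrix `M`, a hypothetical bound matrix `P`, and a coarse grid search over the scaling entries in place of a proper optimization. It checks only the bounding property $\rho(M\Delta) \le k\{P, M^{-1}\}$ for sampled members of the element-by-element class; it does not implement the method of [KOU5].

```python
import numpy as np

def svals(A):
    """Singular values of A in descending order."""
    return np.linalg.svd(A, compute_uv=False)

def cross_condition_number(P, M, l, r):
    """sigma_max(L P R) / sigma_min(L M^-1 R) with L = diag(l), R = diag(r) (eqn 2.22)."""
    L, R = np.diag(l), np.diag(r)
    return svals(L @ P @ R)[0] / svals(L @ np.linalg.inv(M) @ R)[-1]

def random_member(P, rng):
    """A perturbation with |Delta_ij| <= p_ij and arbitrary element phases."""
    modulus = P * rng.uniform(0.0, 1.0, P.shape)
    phase = rng.uniform(0.0, 2.0 * np.pi, P.shape)
    return modulus * np.exp(1j * phase)

# Hypothetical data at one frequency.
M = np.array([[0.6, 0.2j], [-0.3, 0.5]], dtype=complex)
P = np.array([[0.4, 0.1], [0.2, 0.3]])

k_unscaled = cross_condition_number(P, M, [1.0, 1.0], [1.0, 1.0])
# The ratio is unchanged when L or R is multiplied by a scalar, so the first
# entry of each scaling is fixed at 1 and only the second entries are searched.
grid = np.logspace(-1, 1, 41)
k_scaled = min(cross_condition_number(P, M, [1.0, l2], [1.0, r2])
               for l2 in grid for r2 in grid)

rng = np.random.default_rng(3)
worst_sampled = max(np.max(np.abs(np.linalg.eigvals(M @ random_member(P, rng))))
                    for _ in range(500))
print("unscaled bound:", k_unscaled, " scaled bound:", k_scaled,
      " largest sampled rho(M Delta):", worst_sampled)
```

Random sampling generally understates the worst case, so a gap between `worst_sampled` and `k_scaled` in this sketch says nothing about the tightness of the criterion itself.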

Although developed originally for independent element-by-element bounded
perturbations, this result has also been extended to the more general
block-structured problem. For independent block-structured perturbations,
diagonal similarity scaling provides enough degrees of freedom to achieve
MPDA. However, non-diagonal scaling is required when dependent perturbations
exist and, when the cross-condition number does not have a stationary
point, a general principal direction alignment property must be combined
with the eigenvalue shift property to compute the desired bound. The
details of these additional developments are beyond the scope of this
summary and are not directly relevant to the following discussions. For
these reasons, the extensions will not be discussed further here. An
in-depth development of these results can, however, be found in [DAN4], [KOU4],
and [KOU7].

As  was  true  for  the  unstructured  uncertainty  analysis  discussed 
previously,  the  structured  perturbation  criteria  (eqns  2.19  and  2.23)  focus 
strictly  on  system  gain  to  produce  an  immediate  assessment  of  the  magnitude 
of  uncertainty  that  can  be  tolerated  while  maintaining  closed-loop 
stability.  Again,  phase  information  has  been  discarded  and,  hence,  these 
criteria cannot be used to extend the analysis beyond the assessment of
stability.  To  overcome  this  shortcoming,  an  extension  to  the  unstructured 
E-contour  results  discussed  previously  has  been  proposed  based  on  the 
following  result: 

Theorem 2.2: [KOU4], [KOU6] Let the points on the E-contours constructed
for the unstructured perturbations, $\Delta \in D_U$, be denoted by
$z_b = \lambda_i + \rho_E(\theta)\, e^{j\theta}$, and let $\delta\rho_i(\theta) \in R^+$ be the smallest value such that

    $k_o\{P, [G - (z_b - \delta\rho_i(\theta)\, e^{j\theta}) I]\} = 1$.

Then, the loci of points

    $\lambda_i + \{\rho_E(\theta) - \delta\rho_i(\theta)\}\, e^{j\theta}$

form closed curves which contain the eigenvalue inclusion regions for the
structured perturbations, $\Delta \in D_S$.

By  invoking  MPDA,  the  resulting  structured  E-contour  bounds  can  be  shown  to 
be  nonconservative  and,  hence,  the  generalized  Nyquist  criterion  establishes 
a  necessary  and  sufficient  condition  for  robust  stability.  In  addition,  the 
introduction of phase information on the perturbed system has direct
implications for the assessment of robust performance. Thus, it becomes
possible to extend a complete Nyquist analysis to systems with realistic
perturbations  (such  as  those  defined  by  the  class  of  element-by-element 
bounded  perturbations)  provided  appropriate  information  on  the  size  and 
structure  of  the  allowable  perturbations  is  available. 

Before proceeding, it must be pointed out that all of the frequency
response techniques summarized in this chapter for continuous-time systems
are equally valid for discrete-time systems with $s = j\omega$ replaced by
$z = \exp(j\omega T)$. Indeed, the remaining chapters of the thesis will focus
primarily on the discrete-time formulations of these methods.

CHAPTER THREE


STATISTICAL UNCERTAINTY FOR SISO SYSTEMS


A  key  missing  element  in  each  of  the  uncertainty  analysis  techniques 
described  in  Chapter  2  is  the  ability  to  quantify  the  frequency  response 
uncertainty  associated  with  a  given  system  model.  Yet,  the  practical  value 
of  these  techniques  in  control  design  can  only  be  fully  realized  when  this 
information  is  known.  For  situations  where  system  identification  is  used  to 
generate  system  descriptions,  it  is  possible  to  quantify  system  uncertainty 
in  statistical  terms.  The  question  that  then  arises  is  whether  or  not  the 
available  statistical  information  can  be  manipulated  to  produce  an  accurate 
multivariable  characterization  of  frequency  response  uncertainty.  A  logical 
first  step  towards  solving  this  problem  is  the  development  of  frequency 
response  uncertainty  descriptions  for  scalar  systems. 

Either  of  two  widely-recognized  identification  approaches,  spectral 
estimation  (to  estimate  frequency  response  directly  from  input/output  data) 
or  parameter  estimation  (to  generate  frequency  response  estimates  using  the 
identified  system  transfer  function),  may  be  used  to  estimate  system 
frequency  response.  However,  the  additional  requirement  to  quantify  the 
uncertainty  associated  with  the  frequency  response  estimates  implies  that 
any  proposed  methodology  must  not  only  be  able  to  generate  accurate 
frequency  response  estimates,  it  must  also  be  able  to  guarantee  the  accuracy 
of  the  uncertainty  description  at  individual  frequencies  and  to  characterize 
the  interfrequency  dependence  of  the  uncertainty  that  arises  from  the  use  of 
finite  data  sets.  Furthermore,  the  resulting  description  for  scalar  systems 
should  be  compatible  with  the  desired  goal  of  quantifying  multivariable 
uncertainty  for  use  in  a  generalized-Nyquist  analysis  of  perturbed 
multivariable  systems  as  described  in  Chapter  2. 

Research  has  recently  begun  to  address  the  problem  of  characterizing 
frequency  response  uncertainty,  and  some  interesting  results  have  already 
been produced. Using spectral estimation, Loh et al. [LOH1] have generated
confidence bounds for the magnitude and phase of system frequency response
at selected individual frequencies. However, due to the low signal-to-noise
ratios  that  are  inherently  present  at  high  frequencies,  the  resulting 
uncertainty  bounds  will  be  unnecessarily  large  at  these  frequencies.  In 
addition,  these  results  fail  to  address  the  topic  of  interfrequency 
dependence.  So  the  bounds  generated  at  any  one  frequency  are  valid  only  in 
isolation;  a  significant  problem  when  one  is  concerned  about  uncertainty  at 
several  frequencies  simultaneously  for  stability  and  gain/phase  margin 
assessments.  Finally,  it  should  be  noted  that  this  procedure  establishes 
distinct  gain  and  phase  bounds  which  lead  to  a  characterization  of 
uncertainty  that  is  not  particularly  well  suited  to  generalized-Nyquist 
extensions  for  multivariable  systems. 

The  problems  highlighted  above  are  generally  applicable  to  methods 
which  utilize  spectral  estimation.  For  this  reason,  it  appears  that 
parameter  estimation  techniques  may  well  be  better  suited  to  the  problem  at 
hand. Some results in this area are also available. Ljung and Yuan [LJU1]
and Ljung [LJU2] have produced asymptotic variance expressions for the
frequency  response  estimates  at  specified  individual  frequencies  which  can 
be  used  to  generate  confidence  bounds  at  these  frequencies,  while  Edmunds 
[EDM2]  has  proposed  a  procedure  to  generate  confidence  bounds  at  individual 
frequencies  by  transforming  parameter  estimate  statistics  into  corresponding 
frequency  response  statistics.  As  with  spectral  estimation  however,  these 
developments  are  valid  only  at  individually  specified  frequencies  and, 
hence, fail to account for interfrequency dependence. In addition, Edmunds'
procedure  relies  on  a  single-term  Taylor  series  approximation  to  establish 
the  required  transformation  and,  thus,  produces  only  an  approximate  result. 
Janiszowski  [JAN1]  has  also  addressed  the  frequency  response  uncertainty 
problem  and  has  produced  distinct  gain  and  phase  bounds  that  are  valid 
simultaneously  over  all  frequencies.  But,  these  bounds  are  not  exact  and, 
because  the  gain  and  phase  bounds  are  separate,  this  uncertainty  description 
is not well suited for extensions to a generalized-Nyquist analysis of
perturbed multivariable systems. Thus, existing techniques fail to produce
the desired results.

This  chapter  also  addresses  the  problem  of  characterizing  frequency 
response  uncertainty  from  a  parameter  estimation  perspective.  A  technique 
is  developed  to  establish  statistically-derived  bounds  on  system  frequency 
response  from  corresponding  bounds  on  the  parameters  of  the  selected  model. 
These  bounds  are  valid  simultaneously  over  all  frequencies;  thus  accounting 
for  the  interfrequency  dependence  of  the  estimates.  Furthermore,  this 
characterization  defines  confidence  regions  in  the  complex  Nyquist  plane 
rather  than  separate  bounds  on  gain  and  phase,  so  it  can  be  readily  adapted 
to  multivariable  systems.  The  development  begins  with  a  review  of  the 
parameter  identification  problem  and  a  discussion  of  model  selection 
considerations  in  Section  3.1.  Confidence  limits  on  the  parameters  of  the 
selected  system  model  are  developed  in  Section  3.2  using  statistical 
information  on  the  parameter  estimates.  These  confidence  bounds  are  then 
transformed  into  frequency  response  bounds  using  results  derived  in  Section 
3.3.  Methods  to  modify  and  improve  the  resulting  frequency  response  bounds 
are  discussed  in  Section  3.4,  and  the  chapter  concludes  with  a  simulation 
example  to  demonstrate  the  procedure. 

3.1  Model  Parameter  Identification 

3.1.1  Least-Squares  Estimation  and  Related  Statistical  Results 

For  applications  where  parametric  models  are  to  be  generated  from 
available data, scalar discrete-time dynamical systems may be described by
the input/output relationship:

    $y(kT) = d^t(kT)\, \theta^o + \epsilon(kT)$    ....(3.1)

where $\theta^o$ is a ($q_o \times 1$) vector of true model parameters,
      $d^t(kT)$ is a ($1 \times q_o$) vector of past inputs and outputs,
      $\epsilon(kT)$ is an unknown, random error,
and   $T$ is the specified sample time for the system.

When a set of $N$ output observations is available, eqn 3.1 can be used to
generate a vector equation of the form:

    $y = D\, \theta^o + \epsilon$    ....(3.2)

where $y$ is a ($N \times 1$) vector of outputs, $y(T), \ldots, y(NT)$,
      $D$ is a ($N \times q_o$) data matrix containing $d^t(kT)$, $k = 1, \ldots, N$,
and   $\epsilon$ is a ($N \times 1$) vector of random errors, $\epsilon(kT)$, $k = 1, \ldots, N$.

For  this  formulation  of  the  problem,  several  distinct  parameter 
estimation techniques exist [SCH1]. Among these, the most widely-used
method  is  the  least-squares  algorithm  which  derives  parameter  estimates  to 
minimize  the  sum  of  the  squares  of  the  differences  between  measured  and 
estimated  system  outputs.  Using  this  procedure,  the  parameter  estimates  are 
given  by: 

    $\hat{\theta} = (D^T D)^{-1} D^T y$    ....(3.3)

[Note: From this point on, vector transposition will be denoted by the
superscript 't' and matrix transposition will be denoted by the superscript
'T'.] Errors in the observations ensure that $\hat{\theta}$ is a random vector and,
under the assumptions that the elements of $D$ are deterministic and $\epsilon$ is
Gaussian white measurement noise with zero mean and covariance $\sigma_\epsilon^2 I$, it can
be shown (e.g. [FRA1]) that $\hat{\theta}$ is Gaussian with mean and covariance given by:

    $E\{\hat{\theta}\} = \theta^o$    ....(3.4a)

    $V = E\{(\hat{\theta} - \theta^o)(\hat{\theta} - \theta^o)^t\} = \sigma_\epsilon^2 (D^T D)^{-1}$    ....(3.4b)
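
Eqns 3.3 and 3.4b translate directly into a few lines of numerical code. The following is a minimal sketch, assuming NumPy and a known noise variance; the data matrix, the "true" parameters and the function names are hypothetical and serve only to exercise the formulas.

```python
import numpy as np

def least_squares_estimate(D, y):
    """theta_hat = (D^T D)^-1 D^T y (eqn 3.3), solved without forming the inverse."""
    return np.linalg.solve(D.T @ D, D.T @ y)

def estimate_covariance(D, noise_var):
    """V = sigma_eps^2 (D^T D)^-1 (eqn 3.4b); noise_var is assumed to be known."""
    return noise_var * np.linalg.inv(D.T @ D)

rng = np.random.default_rng(0)
D = rng.standard_normal((200, 3))           # hypothetical data matrix, treated as deterministic
theta_true = np.array([1.0, -0.5, 0.25])    # hypothetical true parameters
sigma_eps = 0.1                             # hypothetical noise standard deviation

y = D @ theta_true + sigma_eps * rng.standard_normal(200)
theta_hat = least_squares_estimate(D, y)
V = estimate_covariance(D, sigma_eps**2)
print("estimate:", theta_hat)
print("standard deviations:", np.sqrt(np.diag(V)))
```

Solving the normal equations with `np.linalg.solve` avoids an explicit inverse in the estimate; the inverse is formed only for $V$, where it is the quantity of interest.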

3.1.2  Model  Structures 

Least-squares  parameter  estimates  can  be  obtained  for  any  model 
structure  which  defines  an  input/output  relationship  of  the  form  shown  in 
eqn  3.1.  However,  the  accuracy  of  the  statistical  information  associated 
with  these  estimates  depends  on  the  validity  of  the  assumptions  highlighted 
above.  To  establish  an  accurate  description  of  system  uncertainty,  a  model 
structure  which  satisfies  these  assumptions  must  be  selected. 

One commonly used model structure is the difference equation model
defined by:

    $y(kT) = \frac{b(z^{-1})}{a(z^{-1})}\, u(kT)$    ....(3.5)

where $b(z^{-1})$ and $a(z^{-1})$ are polynomials in the delay operator, $z^{-1}$, whose
coefficients define the model parameters. For the case of white
measurement noise, however, this model fails to produce accurate statistical
information on $\hat{\theta}$ because the noise associated with the model is not white.
Furthermore, the data matrix $D$ contains past values of the observed (noisy)
output so that $D$ is correlated with the measurement noise. Indeed, this
second problem persists even if the model noise is white. As a result, the
expectation arguments used to derive the mean and covariance of $\hat{\theta}$ are only
valid  asymptotically.  Despite  these  statistical  problems,  the  difference 
equation  model  is  often  used  for  applications  requiring  accurate  output 
prediction  because  of  the  relatively  small  number  of  parameters  required  to 
describe  the  system. 

To  generate  accurate  statistical  information  from  a  finite  number  of 
measurements,  other  model  structures  must  be  considered.  A  particularly 
useful  model  structure  for  this  purpose  is  the  weighting  sequence  model 
whose  parameters  are  related  to  system  output  via  the  convolution  summation: 

    $y(kT) = \sum_{i=-\infty}^{\infty} g_i\, u([k-i]T) + \epsilon(kT)$    ....(3.6)

Under  the  assumptions  that  the  system  is  stable,  strictly  causal  (i.e.  the 
system  does  not  respond  to  inputs  which  have  yet  to  be  applied,  nor  does  it 
respond  instantaneously  to  a  current  input)  and  initially  at  rest,  the 
infinite  convolution  summation  above  can  be  accurately  approximated  by  the 
truncated summation [CAD1]:

    $y(kT) = \sum_{i=1}^{q} g_i\, u([k-i]T) + \epsilon(kT)$    ....(3.7)

where $q$ represents the number of significant elements in the model. The use
of this finite-dimensional weighting sequence model avoids the problems
mentioned previously, as the measurement noise associated with the model
remains white and the data matrix $D$ does not contain past noisy
measurements. Thus, the assumptions required to develop the estimate
statistics in eqns 3.4a,b are satisfied.
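
For the truncated weighting sequence model of eqn 3.7, the data matrix contains only delayed inputs. The sketch below shows one way to assemble it, assuming NumPy, zero-based sample indexing and a system initially at rest (inputs before the start of the test are taken as zero); the function name and the test signal are hypothetical.

```python
import numpy as np

def weighting_sequence_data_matrix(u, q):
    """Row k holds [u(k-1), ..., u(k-q)], with u taken as zero before the test starts."""
    N = len(u)
    D = np.zeros((N, q))
    for i in range(1, q + 1):
        D[i:, i - 1] = u[:N - i]       # column i-1 is the input delayed by i samples
    return D

rng = np.random.default_rng(1)
u = rng.standard_normal(50)            # hypothetical test input
g = np.array([0.5, 0.3, 0.1])          # hypothetical weighting sequence g_1, g_2, g_3

D = weighting_sequence_data_matrix(u, q=3)
y_clean = D @ g                        # eqn 3.7 without measurement noise
g_hat = np.linalg.solve(D.T @ D, D.T @ y_clean)   # eqn 3.3
print("recovered weighting sequence:", g_hat)
```

Because every entry of `D` is an input sample, measurement noise on the output never enters the data matrix, which is the property relied on in the text.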

One  additional  feature  of  the  weighting  sequence  model  is  its  linear 
relationship  to  system  frequency  response.  The  z-domain  transfer  function 
for  this  model  is  given  by: 

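    $g(z) = g_1 z^{-1} + g_2 z^{-2} + \cdots + g_q z^{-q} = h^t(z^{-1})\, \theta$    ....(3.8)

where $h^t(z^{-1}) = [z^{-1}\ z^{-2}\ \cdots\ z^{-q}]$ and $\theta = [g_1\ g_2\ \cdots\ g_q]^t$. System
frequency response can then be identified using the expression

    $z = \exp(j\omega T) = \cos(\omega T) + j \sin(\omega T)$

for all frequencies $\omega T \le \pi$. Because $\exp(j\omega T)$ is generally complex,
$g\{\exp(j\omega T)\}$ can be rewritten as the following two-dimensional vector:

    $g\{\exp(j\omega T)\} = \begin{bmatrix} g_{real} \\ g_{imag} \end{bmatrix} = \begin{bmatrix} \cos(\omega T) & \cos(2\omega T) & \cdots & \cos(q\omega T) \\ -\sin(\omega T) & -\sin(2\omega T) & \cdots & -\sin(q\omega T) \end{bmatrix} \theta$

i.e.  $g = H^T \theta$    ....(3.9)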


The  linear  relationship  defined  by  eqn  3.9  is  an  especially  important  tool 
for  transforming  statistical  information  on  the  model  parameters  into 
corresponding  frequency  response  uncertainty  information.  This  development 
will  be  discussed  at  length  in  Section  3.3.  However,  it  is  important  to 
note  here  that  weighting  sequence  models  produce  this  linear  relationship 
directly,  whereas  similar  relationships  for  difference  equation  models  can 
only be established using single-term Taylor-series approximations (as shown
in [EDM2]). This difference provides further justification for the
selection of weighting sequence models to generate accurate statistical
descriptions of frequency response uncertainty.
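
The linear map of eqn 3.9 is straightforward to evaluate. The following sketch assumes NumPy; the parameter values and function names are hypothetical. The last helper applies the standard rule for the covariance of a linear function of a random vector at a single frequency; it is included only as an illustration and is not the simultaneous bound developed in Section 3.3.

```python
import numpy as np

def frequency_map(omega_T, q):
    """H^T of eqn 3.9: first row cos(i wT), second row -sin(i wT), for i = 1..q."""
    i = np.arange(1, q + 1)
    return np.vstack([np.cos(i * omega_T), -np.sin(i * omega_T)])

def frequency_response(theta, omega_T):
    """Complex frequency response g{exp(jwT)} assembled from g = H^T theta."""
    g_real, g_imag = frequency_map(omega_T, len(theta)) @ theta
    return complex(g_real, g_imag)

def response_covariance(V, omega_T):
    """Covariance of (g_real, g_imag) at one frequency: H^T V H (standard linear-map rule)."""
    Ht = frequency_map(omega_T, V.shape[0])
    return Ht @ V @ Ht.T

theta = np.array([0.5, 0.3, 0.1])      # hypothetical weighting sequence g_1, g_2, g_3
omega_T = 0.7                          # hypothetical normalized frequency (rad)

# Direct evaluation of eqn 3.8 at z = exp(jwT), for comparison.
direct = sum(g * np.exp(-1j * (i + 1) * omega_T) for i, g in enumerate(theta))
print(frequency_response(theta, omega_T), direct)
```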


3.2  Model  Parameter  Confidence  Regions 


3.2.1  Bounds  Based  on  the  Chi-Square  Distribution 

The  least-squares  parameter  estimates  for  a  weighting  sequence  model, 
as  mentioned  earlier,  are  normally-distributed  random  variables.  Thus,  they 
can  be  used  to  construct  confidence  regions  for  the  true  parameters.  Rather 
than  generate  bounds  for  each  individual  parameter  as  done  by  Janiszowski 
[JAN1],  it  will  be  useful  for  further  frequency  response  developments  to 
establish  bounds  on  a  scalar  quadratic  function  of  the  error  between  the 
true  and  estimated  parameters  instead.  Lemma  3.1  identifies  the  desired 
function  and  its  probability  distribution. 

Lemma 3.1: [DEU1] For the parameter estimates and associated statistics
in eqns 3.3 and 3.4a,b, let $\Delta\theta = \hat{\theta} - \theta^o$. Then $\Delta\theta$ is normally distributed
with zero mean and covariance $V = \sigma_\epsilon^2 (D^T D)^{-1}$, and

    $Q = \Delta\theta^t\, V^{-1} \Delta\theta$    ....(3.10)

is a chi-square random variable with $q$ degrees of freedom (where $q$ is the
number of estimated parameters).

Proof: The distribution and statistics of $\Delta\theta$ follow immediately from its
definition and eqns 3.4a,b. The distribution of $Q$ can be established in
the following way. Since $V = \sigma_\epsilon^2 (D^T D)^{-1}$, its eigenvalues are all positive
and it can be rewritten in eigenvalue/eigenvector form as

    $V = U \Sigma^2 U^T$  with  $U^T U = I$    ....(3.11)

Now, let

    $\Delta\tilde{\theta} = \Sigma^{-1} U^T \Delta\theta$    ....(3.12)

Then, $E\{\Delta\tilde{\theta}\} = 0$ and $E\{\Delta\tilde{\theta}\, \Delta\tilde{\theta}^t\} = \Sigma^{-1} U^T E\{\Delta\theta\, \Delta\theta^t\} U \Sigma^{-1} = I$. Hence, the
elements of $\Delta\tilde{\theta}$ are independent standard normal variables and, since
$Q = \Delta\theta^t U \Sigma^{-2} U^T \Delta\theta = \Delta\tilde{\theta}^t \Delta\tilde{\theta}$ is simply the sum of the squares of
these elements, it has, by definition, a chi-square distribution with $q$
degrees of freedom.

                                                                ....QED


As $Q$ is a chi-square random variable, confidence bounds can be
established using the cumulative chi-square distribution. For a given
confidence level $\alpha$, the probability statement associated with $Q$ is:

    $P\{Q \le Q_{\alpha,q}\} = \alpha \times 100\%$

where $Q_{\alpha,q}$ is a known constant obtained from cumulative chi-square
distribution tables or calculated using a normal approximation as described in
Section 3.2.3. The corresponding $\alpha \times 100\%$ confidence bound is defined by:

    $Q = \Delta\theta^t\, V^{-1} \Delta\theta = Q_{\alpha,q}$    ....(3.13)

Since $V$ is positive definite and $Q$ is quadratic in $\Delta\theta$, the region $Q \le Q_{\alpha,q}$
defines an ellipsoid in the $q$-dimensional parameter space centred at $\hat{\theta}$ and
containing the true model parameters with $\alpha \times 100\%$ confidence.
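
The coverage property of the chi-square ellipsoid can be checked by simulation. The sketch below assumes NumPy and restricts attention to $q = 2$, for which the chi-square distribution function has the closed form $1 - \exp(-x/2)$ and the constant $Q_{\alpha,2}$ can be computed without tables. The model, noise level and function names are hypothetical.

```python
import math
import numpy as np

def chi2_quantile_2dof(alpha):
    """Exact chi-square quantile for q = 2, from the CDF 1 - exp(-x/2)."""
    return -2.0 * math.log(1.0 - alpha)

def in_confidence_ellipsoid(theta, theta_hat, V, Q_bound):
    """True if theta lies inside the ellipsoid of eqn 3.13 centred at theta_hat."""
    d = theta - theta_hat
    return float(d @ np.linalg.solve(V, d)) <= Q_bound

def chi_square_coverage(alpha=0.95, trials=2000, seed=1):
    """Fraction of simulated experiments whose ellipsoid contains the true parameters."""
    rng = np.random.default_rng(seed)
    theta_true = np.array([0.8, -0.3])          # hypothetical true parameters
    sigma_eps = 0.5                             # noise standard deviation, assumed known
    D = rng.standard_normal((40, 2))            # fixed data matrix for all trials
    V = sigma_eps**2 * np.linalg.inv(D.T @ D)   # eqn 3.4b
    Q_bound = chi2_quantile_2dof(alpha)
    hits = 0
    for _ in range(trials):
        y = D @ theta_true + sigma_eps * rng.standard_normal(40)
        theta_hat = np.linalg.solve(D.T @ D, D.T @ y)   # eqn 3.3
        hits += in_confidence_ellipsoid(theta_true, theta_hat, V, Q_bound)
    return hits / trials

print("empirical coverage:", chi_square_coverage())
```

The empirical coverage should be close to the nominal level $\alpha$, up to Monte Carlo error.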

3.2.2  Bounds  Based  on  the  F  Distribution 

A difficulty in producing the chi-square confidence ellipsoid above
stems from the assumption that the noise variance $\sigma_\epsilon^2$ is known; an assumption
required to establish $Q$ as a chi-square random variable. In practice, this
variance is not known and can, at best, be estimated from the available
parameter estimates and data measurements. For this situation, the results
of Section 3.2.1 can be extended as shown below.

To  begin,  the  following  lemma  identifies  an  estimate  for  the  noise 
variance  and  the  probability  distribution  associated  with  this  estimate. 

Lemma 3.2: [KEN1] For the parameter estimates defined by eqn 3.3,

    $s^2 = \{(y - D\hat{\theta})^t (y - D\hat{\theta})\} / (N - q)$    ....(3.14)

is an unbiased estimate of $\sigma_\epsilon^2$. Furthermore, the quantity $(N - q)\, s^2 / \sigma_\epsilon^2$ is
a chi-square random variable with $(N - q)$ degrees of freedom.

Lemma  3.2  is  simply  an  extension  of  standard  results  when  N  observations  and 
q  estimated  parameters  are  used  to  obtain  the  variance  estimate. 

Using  the  results  of  Lemmas  3.1  and  3.2,  it  is  possible  to  verify  the 
following  theorem: 

Theorem 3.1: Let $\Delta\theta = \hat{\theta} - \theta^{o}$ and let $s^2$ represent the estimated variance of the noise (defined by eqn 3.14). Then, the quantity

    $F = \{\Delta\theta^T V^{-1} \Delta\theta\} / q$  with  $V = s^2 (D^T D)^{-1}$    (3.15)

has an F distribution with $q$ numerator degrees of freedom and $N - q$ denominator degrees of freedom.


Proof: An F-distributed variable is defined as the ratio of two chi-square variables ($Q_1$, $Q_2$) in the following way:

    $F = \{Q_1 / n_1\} / \{Q_2 / n_2\}$

where $Q_1$ and $Q_2$ have $n_1$ and $n_2$ degrees of freedom respectively [GIB1]. From Lemma 3.1, $Q = \Delta\theta^T V^{-1} \Delta\theta$ is chi-square distributed with $q$ degrees of freedom, and Lemma 3.2 demonstrates that $(N-q)\, s^2 / \sigma_n^2$ has a chi-square distribution with $N - q$ degrees of freedom. Now,

    $\frac{\Delta\theta^T V^{-1} \Delta\theta}{q} = \frac{\Delta\theta^T (D^T D) \Delta\theta}{q\, s^2} = \frac{\{\Delta\theta^T (D^T D) \Delta\theta\} / q\,\sigma_n^2}{\{(N-q)\, s^2 / \sigma_n^2\} / (N-q)} = \frac{Q_1 / q}{Q_2 / (N-q)}$

where $Q_1 = Q$ (as per eqn 3.10) and $Q_2 = (N-q)\, s^2 / \sigma_n^2$ (as per Lemma 3.2). By definition, this quantity is F-distributed with $q$ numerator degrees of freedom and $N - q$ denominator degrees of freedom.

.... QED


Using Theorem 3.1, new ellipsoidal confidence regions can be generated using the cumulative F-distribution. The probability statement and corresponding confidence bound are given respectively by:

    $P\{F \le F_{\alpha,q,N-q}\} = \alpha \times 100\%$    (3.16a)

    $\Delta\theta^T V^{-1} \Delta\theta = q\, F_{\alpha,q,N-q}$    (3.16b)

where $F_{\alpha,q,N-q}$ is a known constant obtained from cumulative F-distribution tables or calculated using a normal approximation as described in Appendix 3.1.


Remark: Theorem 3.1 extends the development of parameter confidence regions to the more practical situation where the noise variance is estimated from available data. In many situations however, the difference between chi-square and F statistics is negligible. As the number of measurements increases, $s^2$ approaches $\sigma_n^2$ and the chi-square and F bounds become identical (see the last column of Table 3.1 and note that the square root is used to give a measure of the maximum distance from the centre of the uncertainty region to its boundary). Hence, chi-square assumptions may be used with an estimate of $\sigma_n^2$ to establish the desired parameter bounds provided the number of available measurements is sufficiently large. In situations where this condition is not satisfied, the desired bounds must be developed using the appropriate F statistic.
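The size of the difference between the two statistics can be checked directly from the limits listed in Table 3.1. The following Python sketch is illustrative only: the dictionary and function names are hypothetical, and the numerical limits are simply copied from the table rather than computed.

```python
import math

# Chi-square limits Q_{alpha,q}, copied from Table 3.1: (q, alpha) -> limit
CHI_SQUARE_LIMIT = {
    (20, 0.90): 28.40, (20, 0.95): 31.40,
    (40, 0.90): 51.80, (40, 0.95): 55.75,
}

# F-based limits q * F_{alpha,q,N-q}, copied from Table 3.1: (q, alpha, N) -> limit
F_LIMIT = {
    (20, 0.90, 200): 29.23, (20, 0.90, 500): 28.72, (20, 0.90, 1000): 28.56,
    (20, 0.95, 200): 32.58, (20, 0.95, 500): 31.85, (20, 0.95, 1000): 31.62,
    (40, 0.90, 200): 54.00, (40, 0.90, 500): 52.56, (40, 0.90, 1000): 52.16,
    (40, 0.95, 200): 58.80, (40, 0.95, 500): 56.84, (40, 0.95, 1000): 56.28,
}


def radius_ratio(q, alpha, n_meas):
    """Ratio of the F-based to the chi-square-based distance from the
    centre of the uncertainty region to its boundary, sqrt(qF / Q)."""
    return math.sqrt(F_LIMIT[(q, alpha, n_meas)] / CHI_SQUARE_LIMIT[(q, alpha)])


if __name__ == "__main__":
    for key in sorted(F_LIMIT):
        print(key, round(radius_ratio(*key), 3))
```

In every tabulated case the ratio exceeds one and shrinks towards one as the number of measurements grows, which is the behaviour described in the remark.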

3.2.3  Normal  Approximations  for  Chi-Square  and  F  Statistics 

To generate the confidence bounds defined by eqns 3.13 and 3.16b, statistical limits ($Q_{\alpha,q}$ and $F_{\alpha,q,N-q}$) must be obtained from the appropriate cumulative distribution. Although tables do exist, the confidence levels in these tables are typically restricted to a small number of commonly-used values and linear interpolation between values is inaccurate. To overcome this problem, normal approximations for both chi-square and F statistics have been developed. These approximations can be used with the standard normal distribution to generate accurate confidence limits for variables with chi-square and F distributions.

Table 3.1: Comparison of Statistical Limits

            Confidence     Number of
            level, alpha   measurements, N   Q_{alpha,q}   qF_{alpha,q,N-q}   sqrt[qF/Q]

    q=20    0.90            200              28.40         29.23              1.015
            0.90            500              28.40         28.72              1.006
            0.90           1000              28.40         28.56              1.003
            0.95            200              31.40         32.58              1.019
            0.95            500              31.40         31.85              1.007
            0.95           1000              31.40         31.62              1.003

    q=40    0.90            200              51.80         54.00              1.021
            0.90            500              51.80         52.56              1.007
            0.90           1000              51.80         52.16              1.003
            0.95            200              55.75         58.80              1.027
            0.95            500              55.75         56.84              1.010
            0.95           1000              55.75         56.28              1.005

The Wilson-Hilferty approximation for chi-square variables is given by:


Lemma 3.3: [WIL2] If $Q$ is a chi-square random variable with $q$ degrees of freedom, then $\{Q/q\}^{1/3}$ is approximately normal with mean and variance

    $\mu = 1 - \frac{2}{9q}$,    $\sigma^2 = \frac{2}{9q}$    (3.17a,b)


This  approximation  leads  to  the  following  confidence  limits  for  Q: 


Theorem 3.2: An $\alpha \times 100\%$ confidence bound for $Q$ is given by

    $Q_{\alpha,q} = q\, \{\mu + \sigma Z_\alpha\}^3$    (3.18)

where $Z_\alpha$ is a known constant obtained from the cumulative standard normal distribution.

Proof: From Lemma 3.3, $\{Q/q\}^{1/3}$ is approximately normal with mean and variance given by eqns 3.17a,b. The probability statement and corresponding confidence bound for this variable are, therefore, given by:

    $P\left\{ \frac{(Q/q)^{1/3} - \mu}{\sigma} \le Z_\alpha \right\} = \alpha \times 100\%$    and    $(Q/q)^{1/3} = \mu + \sigma Z_\alpha$

Solving the second expression for $Q$ produces the desired result.

.... QED
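Eqn 3.18 is simple to evaluate numerically. The short Python sketch below is one possible implementation and is not part of the original development: the function name `wilson_hilferty_limit` is hypothetical, and the standard-library class `statistics.NormalDist` is used in place of normal tables to obtain $Z_\alpha$.

```python
from statistics import NormalDist


def wilson_hilferty_limit(alpha, q):
    """Approximate chi-square limit Q_{alpha,q} from eqn 3.18."""
    mu = 1.0 - 2.0 / (9.0 * q)             # mean, eqn 3.17a
    sigma = (2.0 / (9.0 * q)) ** 0.5       # square root of the variance, eqn 3.17b
    z_alpha = NormalDist().inv_cdf(alpha)  # standard normal quantile Z_alpha
    return q * (mu + sigma * z_alpha) ** 3


if __name__ == "__main__":
    for q in (20, 40):
        for alpha in (0.90, 0.95):
            print(q, alpha, round(wilson_hilferty_limit(alpha, q), 2))
```

For $q$ = 20 and 40 at $\alpha$ = 0.90 and 0.95, the values returned agree with the chi-square limits listed in Table 3.1 to within rounding.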


Similar  confidence  bounds  for  F-distributed  variables  can  be  obtained  using 
the  Peizer-Pratt  approximation  [PEI1]  summarized  in  Appendix  3.1. 


3.3  Frequency  Response  Confidence  Regions 

For  any  specified  level  of  confidence,  the  ellipsoidal  regions 
described  by  eqns  3.13  or  3.16b  define  the  complete  set  of  parametric  models 
for  the  given  system.  These  regions  must,  therefore,  also  contain  the 
information  required  to  completely  quantify  system  frequency  response 
uncertainty.  Using  the  linear  relationship  defined  by  eqn  3.9,  this 
additional uncertainty information may be developed by transforming the
given parameter bounds into frequency response confidence limits which are
valid simultaneously over all frequencies from 0 to $\pi/T$. The following
theorem  describes  this  transformation: 

Theorem 3.3: Let $\Delta\theta = \hat{\theta} - \theta^{o}$ and let $\Delta\theta^T V^{-1} \Delta\theta = b$ define an ellipsoidal boundary in the parameter space. If $\Delta g = H^T \Delta\theta = \hat{g} - g^{o}$ is the image of $\Delta\theta$ in the complex plane at any specified frequency $\omega T$, then

(i) every $\Delta g$ on the frequency response ellipse described by

    $\Delta g^T (H^T V H)^{-1} \Delta g = b$    (3.19)

has at least one pre-image $\Delta\theta$ which lies on the given parameter boundary;

(ii) no point $\Delta\theta$ on or inside the ellipsoid boundary maps outside this frequency response ellipse.

Proof: (i) Consider a point $\Delta\theta$ on the boundary defined by $\Delta\theta^T V^{-1} \Delta\theta = b$, and let $\Delta g = H^T \Delta\theta$ represent the image of $\Delta\theta$ in the complex plane at some frequency $\omega T$. Now suppose the matrix $S$ defines an arbitrary ellipse in the complex plane such that $\Delta g^T S \Delta g = b$. For both $\Delta\theta$ and its image $\Delta g$ to lie on their respective boundaries simultaneously, the following condition must be satisfied:

    $\Delta\theta^T V^{-1} \Delta\theta = \Delta g^T S \Delta g = \Delta\theta^T H S H^T \Delta\theta$.

This equation can be rewritten in the following form:

    $\Delta\theta^T \{V^{-1} - H S H^T\} \Delta\theta = 0$    (3.20)

Clearly, when $\Delta\theta$ is selected to satisfy $\{V^{-1} - H S H^T\} \Delta\theta = 0$, eqn 3.20 will be satisfied and both $\Delta\theta$ and $\Delta g$ will lie on their specified boundaries. This condition implies that $\Delta\theta$ must satisfy the following relationship:

    $\Delta\theta = V H S H^T \Delta\theta$    (3.21)

But $\Delta g = H^T \Delta\theta$, so premultiplying eqn 3.21 by $H^T$ and rearranging the result yields:

    $\{I - H^T V H S\} \Delta g = 0$    (3.22)

If $S$ is selected to make $\{I - H^T V H S\}$ rank deficient (by one degree), eqn 3.22 will be satisfied for a single $\Delta g$. However if $S = (H^T V H)^{-1}$, the condition is satisfied for all $\Delta g$ (and $\Delta\theta$). Hence, all points on the frequency response ellipse $\Delta g^T (H^T V H)^{-1} \Delta g = b$ are images of points on the parameter space ellipsoid $\Delta\theta^T V^{-1} \Delta\theta = b$.

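(ii) Consider the quantity

    $\delta = \Delta\theta^T V^{-1} \Delta\theta - \Delta g^T \{H^T V H\}^{-1} \Delta g$    (3.23)

Using the transformation defined by eqn 3.12, the following expressions can be developed:

    $\Delta\theta^T V^{-1} \Delta\theta = \Delta\tilde{\theta}^T \Delta\tilde{\theta}$,    $\Delta g = H^T \Delta\theta = \tilde{H}^T \Delta\tilde{\theta}$

with $\tilde{H}^T = H^T U \Sigma$. Thus, eqn 3.23 can be rewritten as:

    $\delta = \Delta\theta^T V^{-1} \Delta\theta - \Delta g^T \{H^T V H\}^{-1} \Delta g = \Delta\tilde{\theta}^T \Delta\tilde{\theta} - \Delta\tilde{\theta}^T \tilde{H} (\tilde{H}^T \tilde{H})^{-1} \tilde{H}^T \Delta\tilde{\theta}$

    $\quad = \Delta\tilde{\theta}^T \{I - \tilde{H} (\tilde{H}^T \tilde{H})^{-1} \tilde{H}^T\} \Delta\tilde{\theta}$    (3.24)

If $\tilde{M}$ is defined to be the matrix representation of the kernel of $\tilde{H}^T$ (i.e. $\tilde{H}^T \tilde{M} = 0$), a matrix identity (established in Appendix 3.2) can be used to rewrite eqn 3.24 in the following form:

    $\delta = \Delta\theta^T V^{-1} \Delta\theta - \Delta g^T \{H^T V H\}^{-1} \Delta g = \Delta\tilde{\theta}^T \tilde{M} (\tilde{M}^T \tilde{M})^{-1} \tilde{M}^T \Delta\tilde{\theta}$    (3.25)

But the nonzero eigenvalues of $\tilde{M} (\tilde{M}^T \tilde{M})^{-1} \tilde{M}^T$ are equal to the eigenvalues of $(\tilde{M}^T \tilde{M})^{-1} \tilde{M}^T \tilde{M} = I$, so $\tilde{M} (\tilde{M}^T \tilde{M})^{-1} \tilde{M}^T$ is positive semidefinite. Thus, the difference $\delta$ must be greater than or equal to zero for all values of $\Delta\tilde{\theta} = \Sigma^{-1} U^T \Delta\theta$, and so

    $\Delta\theta^T V^{-1} \Delta\theta \ge \Delta g^T \{H^T V H\}^{-1} \Delta g$

for all $\Delta\theta$. This result clearly demonstrates that no point on the given parameter space ellipsoid maps outside the corresponding frequency response ellipse. It also prohibits points inside the given parameter boundary from mapping to points on or outside the given frequency response ellipse since such points must necessarily lie on a smaller ellipsoid and, hence, must map to points on or inside a corresponding smaller ellipse.

.... QED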
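Part (ii) of Theorem 3.3 can be spot-checked numerically. The following Python sketch is illustrative only and rests on two labelled assumptions: the rows of $H$ are taken as $(\cos k\phi, -\sin k\phi)$ for $k = 1, \ldots, q$, which is one plausible reading of eqn 3.9 for a weighting sequence model, and the covariance is factored as $V = L L^T$ with a lower-triangular $L$ instead of the eigenvalue/eigenvector form of eqn 3.11. All function names are hypothetical.

```python
import math
import random


def freq_matrix(q, phi):
    """q x 2 matrix H giving the real and imaginary parts of
    g = sum_k h_k exp(-j k phi). This form of eqn 3.9 is an assumption."""
    return [(math.cos(k * phi), -math.sin(k * phi)) for k in range(1, q + 1)]


def random_factor(q, rng):
    """Lower-triangular L with positive diagonal, so V = L L^T is positive definite."""
    L = [[0.0] * q for _ in range(q)]
    for i in range(q):
        for j in range(i):
            L[i][j] = rng.uniform(-0.5, 0.5)
        L[i][i] = rng.uniform(0.5, 1.5)
    return L


def image_form(L, H, t):
    """Return dg^T (H^T V H)^{-1} dg for dtheta = L t and dg = H^T dtheta."""
    q = len(L)
    dtheta = [sum(L[i][j] * t[j] for j in range(q)) for i in range(q)]
    dg = [sum(H[i][c] * dtheta[i] for i in range(q)) for c in range(2)]
    # B = L^T H, so that H^T V H = B^T B (a 2 x 2 matrix)
    B = [[sum(L[i][r] * H[i][c] for i in range(q)) for c in range(2)]
         for r in range(q)]
    g11 = sum(row[0] * row[0] for row in B)
    g12 = sum(row[0] * row[1] for row in B)
    g22 = sum(row[1] * row[1] for row in B)
    det = g11 * g22 - g12 * g12
    # closed-form inverse of the symmetric 2 x 2 matrix applied to dg
    return (g22 * dg[0] ** 2 - 2.0 * g12 * dg[0] * dg[1] + g11 * dg[1] ** 2) / det


def worst_ratio(q=6, b=10.0, phi=0.7, trials=2000, seed=1):
    """Largest value of dg^T (H^T V H)^{-1} dg / b over random boundary points."""
    rng = random.Random(seed)
    L = random_factor(q, rng)
    H = freq_matrix(q, phi)
    worst = 0.0
    for _ in range(trials):
        v = [rng.gauss(0.0, 1.0) for _ in range(q)]
        norm = math.sqrt(sum(x * x for x in v))
        t = [math.sqrt(b) * x / norm for x in v]  # t^T t = b: on the boundary
        worst = max(worst, image_form(L, H, t) / b)
    return worst
```

Because $\Delta\theta = L t$ gives $\Delta\theta^T V^{-1} \Delta\theta = t^T t = b$, every sampled point lies on the parameter boundary, and the returned ratio should never exceed one apart from rounding error.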

Theorem 3.3 is an extension of set-theoretic results identified by Schweppe [SCH1] to the problem of generating confidence bounds on system frequency response. It clearly demonstrates that the ellipsoidal confidence region containing the true weighting sequence parameters maps completely to a set of uncertainty regions in the complex plane which contain the true system frequency response over all frequencies from 0 to $\pi/T$. The boundaries of these frequency response regions are described by eqn 3.19 where the constant $b$ is now defined by the assumed distribution of the parameter estimates (i.e. $b = Q_{\alpha,q}$ or $b = q\, F_{\alpha,q,N-q}$). In addition, the maximum distance from the centre of the region to its boundary at any given frequency $\omega T$ is given by:

    $|\Delta g|_{max} = \sqrt{b \times \lambda_{max}(H^T V H)}$    (3.26)


The  key  characteristic  of  the  uncertainty  description  developed  here  is 
its  ability  to  quantify  the  interfrequency  dependence  of  the  resulting 
confidence  bounds.  When  finite  data  sets  are  used  to  estimate  system 
frequency  response,  the  uncertainty  information  at  any  one  frequency 
necessarily  depends  upon  uncertainty  information  at  other  frequencies.  A 
complete  description  of  frequency  response  uncertainty  must  account  for  this 
interdependence.  Since  the  description  derived  in  Theorem  3.3  is  generated 
from  a  single  confidence  bound  on  the  model  parameters,  the  bounds  defined 
by  eqn  3.19  are  valid  simultaneously  over  all  frequencies.  Hence,  these 
bounds  quantify  both  the  estimate  uncertainty  at  individual  frequencies  and 
the  interdependence  between  frequencies. 

Theorem 3.3 does not preclude points which lie outside the specified parameter ellipsoid from mapping into these frequency response regions. To investigate the existence of such points, consider first a single frequency $\omega_i T$ and select a point $\Delta\tilde{\theta}^* = \Sigma^{-1} U^T \Delta\theta^*$ on the ellipsoid boundary defined by $\Delta\tilde{\theta}^T \Delta\tilde{\theta} = b$ that has a component in the kernel of $\tilde{H}^T$ at $\omega_i T$. $\Delta\tilde{\theta}^*$ can be written in terms of $\Delta g^* = \tilde{H}^T \Delta\tilde{\theta}^*$ as:

    $\Delta\tilde{\theta}^* = \tilde{H} (\tilde{H}^T \tilde{H})^{-1} \Delta g^* + \tilde{M} x$

where $x$ is a nonzero vector that identifies the component of $\Delta\tilde{\theta}^*$ in the kernel. Using this expression for $\Delta\tilde{\theta}^*$, the following relationship can be developed:

    $\Delta\tilde{\theta}^{*T} \Delta\tilde{\theta}^* = \Delta g^{*T} (\tilde{H}^T \tilde{H})^{-1} \Delta g^* + x^T \tilde{M}^T \tilde{M} x$    (3.27)

But $\tilde{M}^T \tilde{M}$ is positive definite, so eqn 3.27 implies that $\Delta g^{*T} (\tilde{H}^T \tilde{H})^{-1} \Delta g^* < b$. In other words, the map of $\Delta\theta^*$ in the complex plane at $\omega_i T$ lies inside the given frequency response ellipse. On the basis of this observation, it is possible to identify points which lie outside the parameter ellipsoid but which map inside or onto the frequency response ellipse at $\omega_i T$. More specifically, any $\Delta\tilde{\theta} = c\, \Delta\tilde{\theta}^*$, where $c$ is a constant in the range

    $1 < c^2 \le b / \{b - x^T \tilde{M}^T \tilde{M} x\}$,    (3.28)

will satisfy this condition. Furthermore, if $\Delta\tilde{\theta}^*$ is completely in the kernel of $\tilde{H}^T$, an upper bound on the modulus of these points does not exist.

Upon extending this argument over all frequencies from 0 to $\pi/T$, it becomes clear that the only points outside the parameter ellipsoid which map into the frequency response regions at each and every frequency correspond to points on the ellipsoid boundary which have a component in the kernel of $\tilde{H}^T$ at every frequency. Such points do, in fact, exist. However, it can be shown that there is no direction in the parameter space that lies entirely within the kernel of $\tilde{H}^T$ at all frequencies. As a result, the problem points outside the ellipsoid are necessarily bounded in modulus as shown in eqn 3.28. In addition, the correct upper bound on the points in each direction must be calculated at the frequency where $x^T \tilde{M}^T \tilde{M} x$ is a minimum. Hence, these points comprise only small portions of the parameter space outside the specified ellipsoid.

The existence of this small set of additional points suggests that the $\alpha \times 100\%$ confidence associated with the parameter ellipsoid is a lower bound on the confidence associated with the frequency response regions established in Theorem 3.3. However, it must be noted that every point on each frequency response ellipse has a pre-image on the given parameter ellipsoid. As a result, the bounds defined by eqn 3.19 over all frequencies from 0 to $\pi/T$ represent attainable maximum bounds on system frequency response for the prescribed set of possible models. Hence, these regions provide the best possible multifrequency description of frequency response uncertainty for the specified problem.

3.4  Confidence  Bound  Adjustment  Techniques 

The  results  presented  in  the  previous  sections  establish  the 
methodology  for  generating  statistical  bounds  on  system  frequency  response 
for  any  given  set  of  input/output  data.  Of  course,  the  ultimate  goal  of  the 
identification  process  is  to  produce  the  most  accurate  information  possible. 
Within  this  context,  the  statistical  nature  of  these  uncertainty 
descriptions  provides  special  opportunities  to  modify  and  improve  the 
frequency  response  confidence  bounds  at  specific  frequencies  of  interest. 
Two  particularly  important  techniques  to  accomplish  this  task  are  discussed 
below,  and  their  effects  on  frequency  response  uncertainty  will  be 
demonstrated  by  example  in  Section  3.5. 

3.4.1  Test  Input  Selection 

Although  some  situations  may  dictate  certain  restrictions,  the  user  is 
generally  free  to  select  the  test  conditions  (e.g.  system  inputs,  sampling 
rates,  number  of  samples,  etc.)  for  any  given  identification  test.  It  is 
widely  recognized  that  this  extra  freedom  can  be  used  to  improve  the 
accuracy  of  the  resulting  system  description.  For  open-loop  identification, 
the selection of test inputs plays a particularly important role in this
area,  and  a  great  deal  of  research  has  focused  on  the  design  of  inputs  which 
are  optimal  in  some  "estimate  accuracy"  sense.  Historically,  the  goal  of 
time-domain  identification  has  been  parameter  estimation.  Hence,  optimal 
input  design  research  has  focused  almost  exclusively  on  generating  accurate 
parameter estimates, and comprehensive results in this area have been
presented by Goodwin and Payne [GOO1] and Mehra [MEH1]. In fact, when input
power is constrained and the criterion to be minimized is defined by:

    $J = -\log \{\det V^{-1}\}$

where  V  is  the  parameter  estimate  covariance  matrix  (eqn  3.4b),  it  has  been 
shown  that  the  optimal  input  for  identifying  the  parameters  of  the  weighting 
sequence  model  in  eqn  3.7  is  a  white  noise  sequence. 

Clearly  however,  the  primary  goal  of  the  identification  process  for 
applications  such  as  those  described  in  Chapter  2  is  accurate  frequency 
response  information,  not  accurate  parameter  estimates.  It  seems 
reasonable,  therefore,  to  anticipate  that  the  optimal  input  design  to  meet 
this goal may differ from that discussed above. Indeed, recent work by Yuan
and Ljung [YUA1] relies on an alternative frequency-domain optimization
criterion  to  produce  results  which  demonstrate  the  importance  of  using 
frequency-dependent  inputs  to  improve  frequency  response  accuracy.  Although 
the  details  of  their  work  are  beyond  the  scope  of  this  section,  their 
results  suggest  that  even  relatively  simple  input  designs  (such  as  filtered 
random  inputs)  can  be  used  to  improve  the  accuracy  of  the  frequency  response 
information  over  specified  frequency  regions.  These  results  are  particularly 
useful  when  the  critical  frequencies  of  the  system  are  known  to  the  user,  as 
it  may  then  be  possible  to  design  inputs  which  generate  a  more  accurate 
evaluation  of  system  performance  in  terms  of  gain  and  phase  margins. 

It  should  be  noted  however  that,  under  common  test  constraints  (such  as 
limitations  on  input  power),  improvements  in  accuracy  at  certain  frequencies 
are  necessarily  offset  by  degradations  at  other  frequencies.  Furthermore, 
as shown in the example in Section 3.5, both the size and shape of the
resulting  confidence  bounds  are  influenced  by  the  inputs  used.  These 
conditions  suggest  that  it  may  still  be  practical  to  consider  the  use  of 
white  noise  inputs  (such  as  pseudorandom  binary  or  normally-distributed 
sequences).  In  fact,  the  use  of  this  type  of  input  produces  interesting 
effects of its own, as demonstrated by the following result:

Theorem 3.4: If $V = \sigma^2 I$, then

    $\frac{\lambda_{max}(H^T V H)}{\lambda_{min}(H^T V H)} = \frac{(1 + \beta/q)}{(1 - \beta/q)}$    (3.29)

where

    $\beta^2 = \frac{1 - \cos(2q\phi)}{1 - \cos(2\phi)}$    and    $\phi = \omega T$


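Proof: If $V = \sigma^2 I$, then $H^T V H = \sigma^2 (H^T H)$ and eqn 3.29 is simply the ratio of the eigenvalues of $H^T H$. If $U$ is defined as:

    $U = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -j \\ 1 & j \end{bmatrix}$

then $U^* U = I$ and the eigenvalues of $U H^T H U^*$ and $H^T H$ are identical. Using the expression for $H^T$ in eqn 3.9 and recognizing that

    $\sum_{k=1}^{q} \exp\{j2k\phi\} = \exp\{j2\phi\} \left\{ \frac{1 - \exp\{j2q\phi\}}{1 - \exp\{j2\phi\}} \right\}$,

the following expression for $U H^T H U^*$ can be developed:

    $U H^T H U^* = \frac{1}{2} \begin{bmatrix} q & \sum_{k=1}^{q} \exp\{j2k\phi\} \\ \sum_{k=1}^{q} \exp\{-j2k\phi\} & q \end{bmatrix}$

The eigenvalues of this matrix are solutions of the equation:

    $\lambda^2 - q\lambda + (1/4) \left\{ q^2 - \frac{|1 - \exp\{j2q\phi\}|^2}{|1 - \exp\{j2\phi\}|^2} \right\} = 0$

which can be rewritten as:

    $\lambda^2 - q\lambda + (1/4) \left\{ q^2 - \frac{1 - \cos\{2q\phi\}}{1 - \cos\{2\phi\}} \right\} = 0$

Letting $\beta^2 = (1 - \cos\{2q\phi\}) / (1 - \cos\{2\phi\})$ and solving for $\lambda$ yields:

    $\lambda_{max} = q/2 + \beta/2$,    $\lambda_{min} = q/2 - \beta/2$

Hence, the ratio $\lambda_{max} / \lambda_{min}$ is given by eqn 3.29.

.... QED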


For weighting sequence models, the least-squares data matrix $D$ contains only system inputs. As a result, white noise inputs with variance $\sigma_u^2$ will produce a matrix $(D^T D)$ which tends to $N \sigma_u^2 I$ as the number of measurements increases. So, the parameter estimate covariance matrix will be a constant diagonal matrix. By Theorem 3.4 then, the resulting frequency response boundaries will be circular at frequencies corresponding to nonzero integer multiples of $\pi/qT$. Furthermore, the deviation of these bounds away from circles at other frequencies can be predicted. For any angle $\phi = k\pi/q + \Delta\phi$ lying between $k\pi/q$ and $(k+1)\pi/q$, $\lambda_{max}/\lambda_{min}$ can be calculated using the following expression for $\beta$:

    $\beta = \sqrt{\frac{1 - \cos(2q\,\Delta\phi)}{1 - a\cos(2\,\Delta\phi) + b\sin(2\,\Delta\phi)}}$,    $a = \cos(2\pi k/q)$,    $b = \sin(2\pi k/q)$

The ratio of the length of the semimajor axis to that of the semiminor axis is then simply the square root of $(1 + \beta/q)/(1 - \beta/q)$. For a 40-parameter model, this quantity ranges from 1.025 to 1.25 for $4.5^\circ < \omega T < 175.5^\circ$ [CLO1] and suggests that the use of random zero-mean inputs will, indeed, produce nearly circular bounds over a wide range of frequencies. This characteristic is particularly important when the description of frequency response uncertainty is extended to multivariable systems as will be discussed in Chapter 6.
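The axis ratio predicted by Theorem 3.4 can be evaluated with a few lines of code. The Python sketch below is illustrative only; the function names are hypothetical, and the direct eigenvalue check assumes that the rows of $H$ are $(\cos k\phi, -\sin k\phi)$ for $k = 1, \ldots, q$, which is one plausible reading of eqn 3.9.

```python
import math


def beta(q, phi):
    """beta of Theorem 3.4: beta^2 = (1 - cos 2q phi) / (1 - cos 2 phi)."""
    return math.sqrt((1.0 - math.cos(2 * q * phi)) / (1.0 - math.cos(2 * phi)))


def axis_ratio(q, phi):
    """Semimajor to semiminor axis ratio, sqrt((1 + beta/q) / (1 - beta/q))."""
    r = beta(q, phi) / q
    return math.sqrt((1.0 + r) / (1.0 - r))


def axis_ratio_direct(q, phi):
    """The same ratio from the eigenvalues of the 2 x 2 matrix H^T H,
    with H assumed to have rows (cos k phi, -sin k phi), k = 1..q."""
    cc = sum(math.cos(k * phi) ** 2 for k in range(1, q + 1))
    ss = sum(math.sin(k * phi) ** 2 for k in range(1, q + 1))
    cs = -sum(math.cos(k * phi) * math.sin(k * phi) for k in range(1, q + 1))
    mean = (cc + ss) / 2.0
    half_gap = math.sqrt(((cc - ss) / 2.0) ** 2 + cs ** 2)
    # eigenvalues of a symmetric 2 x 2 matrix are mean +/- half_gap
    return math.sqrt((mean + half_gap) / (mean - half_gap))


if __name__ == "__main__":
    for deg in (6.75, 9.0, 45.0, 92.25):
        phi = math.radians(deg)
        print(deg, round(axis_ratio(40, phi), 4), round(axis_ratio_direct(40, phi), 4))
```

For $q = 40$ the ratio evaluates to about 1.24 at $\omega T = 6.75^\circ$ and about 1.025 at $\omega T = 92.25^\circ$, consistent with the range quoted above, and it reduces to one at integer multiples of $4.5^\circ$.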

3.4.2  Parameter  Weighting 

A  second  approach  to  reducing  the  size  and  altering  the  shape  of  the 
desired  frequency  response  bounds  arises  from  the  fact  that  the  parameter 
estimate ellipsoid is simply an intermediate step from input/output data to
frequency response uncertainty. Parameter alterations may, therefore, be
used  to  generate  improvements  in  the  frequency  response  bounds  provided  the 
confidence  associated  with  the  modified  parameter-space  ellipsoid  remains 
unchanged.  One  technique  to  accomplish  this  task  involves  weighting  of  the 
model  parameters  as  described  below. 

If $Q = \Delta\tilde{\theta}^T \Delta\tilde{\theta}$ is a chi-square random variable, it is possible to generate confidence bounds for the quantity:

    $Q_w = (\Delta\theta^T U \Sigma^{-1})\, W^{-1}\, (\Sigma^{-1} U^T \Delta\theta) = \Delta\tilde{\theta}^T W^{-1} \Delta\tilde{\theta}$    (3.30)

where $W^{-1}$ is a positive diagonal matrix and $V$ has been rewritten in eigenvalue/eigenvector form as shown in eqn 3.11. The cumulative distribution for any given $Q_w$ is a function of the number of parameters and the specified weights. Although this distribution can be identified and tables of confidence limits can be generated [JOH1, JOH2, SOL1], it is generally not practical to develop exact limits for all possible situations. To overcome this problem, Jensen and Solomon [JEN1] have identified a function of the variable $Q_w$ that can be accurately approximated by a normal random variable, thus permitting the use of the cumulative standard normal distribution to derive appropriate confidence limits. The function and its statistics are described in the following lemma:


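Lemma 3.4: [JEN1] The quantity $\{Q_w / (\mathrm{tr}\, W^{-1})\}^{h_0}$ where

    $h_0 = 1 - \frac{2\, (\mathrm{tr}\, W^{-1})\, (\mathrm{tr}\, W^{-3})}{3\, (\mathrm{tr}\, W^{-2})^2}$    (3.31)

is approximately normal with mean and variance given by:

    $\mu = 1 + \frac{h_0 (h_0 - 1)\, (\mathrm{tr}\, W^{-2})}{(\mathrm{tr}\, W^{-1})^2}$,    $\sigma^2 = \frac{2\, h_0^2\, (\mathrm{tr}\, W^{-2})}{(\mathrm{tr}\, W^{-1})^2}$    (3.32)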


Using  Lemma  3.4,  confidence  bounds  can  be  established  as  demonstrated  in  the 
following  theorem: 


Theorem 3.5: An $\alpha \times 100\%$ confidence boundary for $Q_w$ is given by

    $Q_w \le Q_{w,\alpha,q} = (\mathrm{tr}\, W^{-1})\, \{\mu + \sigma Z_\alpha\}^{1/h_0}$    (3.33)

where $Z_\alpha$ is a known constant obtained from the cumulative standard normal distribution.

Proof: See Theorem 3.2.

The approximate limits in eqn 3.33 have been compared to exact limits for up
to 5 parameters [JEN1]. For the 5 parameter case with weights ranging from
2.5 to 0.4, this comparison indicated errors of less than 0.6% for $\alpha > 0.9$.
It  also  suggested  that,  in  general,  the  approximation  improves  as  confidence 
levels  increase  and  variations  among  the  weighting  elements  decrease.  As  no 
comparisons  are  available  for  situations  involving  more  than  5  parameters, 
Monte-Carlo  simulations  were  conducted  using  the  example  in  Section  3.5  to 
verify  the  approximation  for  larger  numbers  of  parameters.  The  results  of 
these  simulations  are  presented  and  discussed  in  detail  in  Appendix  3.3, 
but,  to  summarize  here,  these  results  suggest  that  the  approximation  is, 
indeed,  accurate  for  much  larger  numbers  of  parameters. 
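The limit in eqn 3.33 can be computed directly from the diagonal elements of $W^{-1}$. The following Python sketch is illustrative only: the function name `jensen_solomon_limit` is hypothetical, `statistics.NormalDist` stands in for normal tables, and no guard is included for weight sets that make $h_0$ zero or negative.

```python
from statistics import NormalDist


def jensen_solomon_limit(alpha, inv_weights):
    """Approximate limit for Q_w = sum_i w_i^{-1} z_i^2 from eqn 3.33.

    inv_weights holds the diagonal elements of W^{-1}.
    """
    t1 = sum(inv_weights)                       # tr W^{-1}
    t2 = sum(w ** 2 for w in inv_weights)       # tr W^{-2}
    t3 = sum(w ** 3 for w in inv_weights)       # tr W^{-3}
    h0 = 1.0 - 2.0 * t1 * t3 / (3.0 * t2 ** 2)  # eqn 3.31
    mu = 1.0 + h0 * (h0 - 1.0) * t2 / t1 ** 2   # mean, eqn 3.32
    sigma = (2.0 * h0 ** 2 * t2) ** 0.5 / t1    # square root of the variance, eqn 3.32
    z_alpha = NormalDist().inv_cdf(alpha)       # standard normal quantile Z_alpha
    return t1 * (mu + sigma * z_alpha) ** (1.0 / h0)


if __name__ == "__main__":
    print(round(jensen_solomon_limit(0.90, [1.0] * 40), 2))
    print(round(jensen_solomon_limit(0.90, [2.5, 1.0, 0.4]), 3))
```

With all weights equal to one, $h_0 = 1/3$ and eqns 3.31 to 3.33 reduce to the Wilson-Hilferty expressions of eqns 3.17 and 3.18, so the first call returns the unweighted chi-square limit for 40 parameters.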

Using Theorem 3.5 to establish appropriate confidence limits, parameter weighting may be applied to alter the shape of the parameter space ellipsoid while maintaining the same level of confidence. The new boundary is described by:

    $\Delta\tilde{\theta}^T W^{-1} \Delta\tilde{\theta} = Q_{w,\alpha,q}$    (3.34)

From Theorem 3.3, the frequency response bounds corresponding to this new parameter space bound are given by:

    $\Delta g^T (\tilde{H}^T W \tilde{H})^{-1} \Delta g = Q_{w,\alpha,q}$  where  $\tilde{H}^T = H^T U \Sigma$    (3.35)

Thus, parameter weighting also affects the resulting frequency response bounds, and this result may be used to generate tighter bounds over specified frequency ranges. One particularly useful approach to accomplish this task is to select the elements of $W$ so that the maximum distance from the centre of the ellipse to its boundary, defined by:

    $|\Delta g|_{max} = \sqrt{Q_{w,\alpha,q} \times \lambda_{max}(\tilde{H}^T W \tilde{H})}$,    (3.36)

is minimized at a particular frequency. Because $|\Delta g|_{max}$ is a nonlinear function of the elements of $W$, it is not possible to obtain an analytic solution to this problem. However, a numerical solution can be readily identified using common multivariable optimization algorithms.

It must be noted that the parameter weights generated using the
procedure  outlined  above  are  optimal  only  at  the  single  frequency  specified 
during  the  optimization.  However,  as  suggested  by  the  example  in  Section 
3.5,  the  effects  of  this  weighting  can  be  expected  to  generate  reduced 
bounds  over  a  range  of  frequencies  surrounding  the  specified  frequency. 
These  simulation  results  also  demonstrate  that  the  optimal  weighting  tends 
to  produce  circular  bounds  over  the  same  range  of  frequencies.  Finally, 
unlike  the  design  of  modified  test  inputs,  prior  knowledge  of  the  critical 
frequency  range  is  not  required  as  parameter  weighting  may  be  applied  after 
the  identification  test  to  any  set  of  parameter  estimates.  Hence,  the 
unweighted  parameter  estimates  can  be  used  to  identify  the  critical 
frequency  range,  and  new  tighter  bounds  can  then  be  generated  using 
parameter  weighting  to  produce  a  more  effective  assessment  of  system 
stability  margins. 

3.5  Simulation  Example 

Input/output  simulations  were  used  to  demonstrate  the  concepts 
presented  in  this  chapter.  The  system  model  selected  for  the  example  was  an 
exact  40-element  weighting  sequence  whose  coefficients  are  given  by  the 
first  40  terms  in  the  impulse  response  of  the  transfer  function: 

    $g(z) = \frac{.721 z^3 - .941 z^2 + .255 z + .0283}{z\, (z^3 - 2.071 z^2 + 1.351 z - .267)}$
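The weighting sequence coefficients follow from the transfer function by a simple recursion. The Python sketch below is illustrative only: the function and constant names are hypothetical, and the coefficient lists assume that the numerator reads $.721 z^3 - .941 z^2 + .255 z + .0283$ and the denominator reads $z(z^3 - 2.071 z^2 + 1.351 z - .267)$.

```python
def weighting_sequence(num, den, n_terms):
    """First n_terms impulse-response samples h_1..h_n of num(z)/den(z).

    num and den are coefficient lists in descending powers of z, with
    len(num) < len(den) so that h_0 = 0 (strictly proper system).
    """
    n = len(den) - 1
    # pad the numerator with leading zeros so both polynomials have degree n
    b = [0.0] * (n + 1 - len(num)) + [c / den[0] for c in num]
    a = [c / den[0] for c in den]
    h = []
    for k in range(n_terms + 1):
        # difference equation: h_k = b_k - sum_{i=1}^{min(k,n)} a_i h_{k-i}
        value = b[k] if k <= n else 0.0
        value -= sum(a[i] * h[k - i] for i in range(1, min(k, n) + 1))
        h.append(value)
    return h[1:]  # drop h_0, which is zero for a strictly proper system


# Assumed reading of the transfer function coefficients
NUM = [0.721, -0.941, 0.255, 0.0283]
DEN = [1.0, -2.071, 1.351, -0.267, 0.0]  # z (z^3 - 2.071 z^2 + 1.351 z - 0.267)

if __name__ == "__main__":
    h = weighting_sequence(NUM, DEN, 40)
    print([round(x, 4) for x in h[:5]])
```

Under these assumptions the first two coefficients are $h_1 = 0.721$ and $h_2 \approx 0.5522$.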

The  true  coefficients  are  displayed  in  Figure  3.1  (where  a  solid  line  has 
been  used  to  connect  the  terms).  Two  sets  of  data  (consisting  of  220 
samples  each)  were  generated  using  white  measurement  noise  with  a  standard 
deviation  of  0.05.  Test  inputs  for  Simulation  1  were  generated  as  a  zero- 
mean  pseudo-random  Gaussian  sequence  with  a  standard  deviation  of  0.45, 
while  inputs  for  Simulation  2  were  generated  as  a  pseudo-random  Gaussian 
sequence  with  the  same  statistics  but  modified  by  a  high  pass  filter 
described by the transfer function $f(z) = (z-1)/(z-0.6)$.

The input/output information generated for each test was used in an off-line least-squares algorithm to estimate the 40 coefficients in the assumed model. These estimates are superimposed on the true weighting sequence in Figure 3.1. Then, using the procedures described in this chapter, 90% confidence bounds on system frequency response were identified using the following relationships:

    Chi-square:  $\Delta g^T (H^T V H)^{-1} \Delta g \le 51.80$
    F:           $\Delta g^T (H^T V H)^{-1} \Delta g \le 53.76$

The chi-square bounds are displayed in Figure 3.2 for seven frequencies between $\omega T = 1^\circ$ and $\omega T = 179^\circ$. Bounds for the F-distribution limit differ only slightly in size (larger by 1.9%) and are not displayed. In addition, Table 3.2 lists the maximum distance from the centre of the uncertainty region to its boundary (i.e. the length of the semimajor axis of the ellipse) at each frequency.

[Figure 3.1: System Weighting Sequence Parameters. a) Random Input, b) Filtered Random Input]

[Figure 3.2: System Frequency Response Confidence Bounds]

Table 3.2: Maximum Frequency Response Error Bounds

    Frequency   Random   Filtered random   Filtered random input
    (deg)       input    input             with parameter weighting

      1.00      0.307    2.387             1.245
      2.37      0.282    2.225             1.157
      5.64      0.318    1.478             0.780
     13.38      0.355    0.750             0.571
     31.76      0.243    0.268             0.287
     75.40      0.286    0.241             0.260
    179.0       0.353    0.283             0.305

The results presented in Figure 3.2 and Table 3.2 clearly demonstrate
the ability of the procedure developed in this chapter to characterize and
quantify  the  statistical  uncertainty  associated  with  the  given  frequency 
response estimates. As predicted, the use of random inputs produced
near-circular frequency response bounds over most frequencies, while the use of
frequency-modified inputs improved these bounds over a selected range of
frequencies. Indeed for this example, the use of high-pass-filtered random
inputs reduced the size of the uncertainty regions by between 16% and 20% for
frequencies above $\omega T = 75^\circ$. As these frequencies represent the critical
frequencies  of  the  system,  this  result  demonstrates  the  importance  of  input 
selection  in  generating  tight  uncertainty  bounds  and,  ultimately,  in 
producing  a  more  effective  assessment  of  system  behaviour.  However,  it 
should  be  noted  that  the  boundary  reductions  at  high  frequency  were  achieved 
at  the  expense  of  significantly  larger  and  more  elliptical  bounds  over  a 
range  of  low  frequencies.  Although  these  low  frequency  changes  do  not 
affect  the  evaluation  of  gain  and  phase  margins  in  this  example,  they  do 
suggest  that  the  indiscriminate  application  of  frequency-modified  inputs  can 
lead  to  erroneous  assessments  of  system  performance  based  on  bounds  at 
critical  frequencies  which  are  generated  in  conjunction  with  unacceptably 
large  bounds  at  other  frequencies. 
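
The percentage changes quoted above can be reproduced from the first two data
columns of Table 3.2. The following Python sketch is only an arithmetic check
on the tabulated values; the dictionary layout and variable names are
illustrative and do not come from the thesis.

```python
# Maximum error bounds from Table 3.2: wT (deg) -> (random input, filtered random input)
bounds = {
    1.00: (0.307, 2.387),
    2.37: (0.282, 2.225),
    5.64: (0.318, 1.478),
    13.38: (0.355, 0.750),
    31.76: (0.243, 0.268),
    75.40: (0.286, 0.241),
    179.0: (0.353, 0.283),
}

# Percentage reduction achieved by filtering (negative values mean the bound grew).
reduction = {
    freq: 100.0 * (random_in - filtered_in) / random_in
    for freq, (random_in, filtered_in) in bounds.items()
}

for freq in sorted(reduction):
    print(f"wT = {freq:6.2f} deg: change in bound = {-reduction[freq]:+8.1f} %")
```

The two highest frequencies show reductions of roughly 16% and 20%, while the
lowest frequencies show the bound growing several-fold, which is the trade-off
described in the text.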

In addition to demonstrating the effects of test input selection, the
results from Simulation 2 (the test using high-pass-filtered inputs) were
also used to examine the effects of parameter weighting on the size and
shape of the frequency response bounds. In this case, parameter weights
were generated to minimize the maximum error bound at ωT = 5.64° using the
procedure described in Section 3.4.2. The set of frequency response bounds
produced after implementing the optimal weighting is displayed in Fig. 3.3
along with a comparison of the before- and after-weighting bounds at
ωT = 5.64°. In addition, the new maximum error bound at each frequency is
listed in Table 3.2. The effects of parameter weighting in this example are
dramatic. Over frequencies from ωT = 1° to ωT = 15°, the maximum error
bounds have been reduced significantly (by as much as 40% for frequencies
less than ωT = 6°) at the expense of only minimal increases at frequencies
much larger than ωT = 5.64°. Indeed, much of the high frequency boundary
reduction initially achieved by using frequency-modified inputs has been
preserved despite the application of parameter weighting at a significantly
lower frequency. Furthermore, the elliptical bound at ωT = 5.64° is now
circular. It should, however, be noted that ωT = 5.64° was selected
primarily to demonstrate the effects of parameter weighting. Because the
critical frequencies for this system are significantly larger than
ωT = 5.64°, practical applications of parameter weighting might, perhaps,
instead be focused on achieving larger improvements in the intermediate
frequency range from 25° to 60° at the expense of such dramatic improvements
at low frequencies. Nevertheless, the simulation results presented here
clearly demonstrate the usefulness of parameter weighting in reducing
boundary size and altering boundary shape at specific frequencies.
Potential improvements in the assessment of system behaviour are apparent.

Figure 3.3: Frequency Response Confidence Bounds after Parameter Weighting
(x = true frequency response; vertical axes: Imag)
Appendix 3.1: The Peizer-Pratt Approximation for F Statistics [PEI1]

Let F denote an F-distributed random variable with k numerator degrees
of freedom and m denominator degrees of freedom, and let $F_{k,m}$ denote a
known upper bound on F. Then $P\{F \le F_{k,m}\} = \alpha \times 100\%$, and
the confidence level, $\alpha$, can be identified using the following
procedure:

Define:

$$S = \tfrac{1}{2}(m-1); \qquad n = \tfrac{1}{2}(k+m) - 1; \qquad p = \frac{m}{k F_{k,m} + m}$$

$$T = \tfrac{1}{2}(k-1); \qquad q = 1 - p$$

Then compute:

$$d_1 = S + \tfrac{1}{6} - \left(n + \tfrac{1}{3}\right) p$$

$$d_2 = d_1 + 0.02 \left\{ \frac{q}{S + 0.5} - \frac{p}{T + 0.5} + \frac{q - 0.5}{n + 1} \right\}$$

$$Z = d_2 \left\{ \frac{1 + q\, f\{S/np\} + p\, f\{T/nq\}}{\left(n + \tfrac{1}{6}\right) p\, q} \right\}^{1/2}$$

where $f\{x\} = \dfrac{1 - x^2 + 2x \ln x}{(1 - x)^2}$ and Z denotes the standard normal
confidence limit that defines $\alpha$ for the given bound, $F_{k,m}$.
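
The procedure above can be sketched in a few lines of Python. This is a
minimal illustration rather than the implementation used in the thesis: the
function names are hypothetical, p is taken as m/(kF + m), f{1} is taken as
its limiting value of zero, and the confidence level is obtained from Z via
the standard normal distribution function.

```python
import math

def _f(x):
    """f{x} = (1 - x^2 + 2 x ln x) / (1 - x)^2, with its limits at x = 0 and x = 1."""
    if x <= 0.0:
        return 1.0                      # limiting value as x -> 0
    if abs(x - 1.0) < 1e-4:
        return -(x - 1.0) / 3.0         # leading series term near x = 1 (avoids 0/0)
    return (1.0 - x * x + 2.0 * x * math.log(x)) / (1.0 - x) ** 2

def peizer_pratt_confidence(f_bound, k, m):
    """Approximate alpha = P{F <= f_bound} for k and m degrees of freedom (k + m > 2)."""
    s = 0.5 * (m - 1)
    t = 0.5 * (k - 1)
    n = 0.5 * (k + m) - 1.0
    p = m / (k * f_bound + m)           # ASSUMED form of p (see lead-in)
    q = 1.0 - p
    d1 = s + 1.0 / 6.0 - (n + 1.0 / 3.0) * p
    d2 = d1 + 0.02 * (q / (s + 0.5) - p / (t + 0.5) + (q - 0.5) / (n + 1.0))
    z = d2 * math.sqrt((1.0 + q * _f(s / (n * p)) + p * _f(t / (n * q)))
                       / ((n + 1.0 / 6.0) * p * q))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard normal CDF

# Usage: the tabulated 95% point of F with (5, 10) degrees of freedom is about 3.3258.
alpha = peizer_pratt_confidence(3.3258, 5, 10)
print(round(alpha, 3))
```

A convenient sanity check is the symmetric case k = m with a bound of 1, for
which the sketch returns a confidence level of one half.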


Appendix  3.2:  A  Matrix  Identity 

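Given a full-rank matrix $H^T$ [2 x q], it is always possible to find a
full-rank matrix M [q x (q-2)] such that $H^T M = 0$. Furthermore, the inverse
of the partitioned matrix

$$A = \begin{bmatrix} H^T \\ M^T \end{bmatrix}$$

is given by $A^{-1} = \left[\, H(H^TH)^{-1} \mid M(M^TM)^{-1} \,\right]$.

This can be verified as follows:

$$\begin{bmatrix} H^T \\ M^T \end{bmatrix} \left[\, H(H^TH)^{-1} \mid M(M^TM)^{-1} \,\right] = \begin{bmatrix} (H^TH)(H^TH)^{-1} & 0 \\ 0 & (M^TM)(M^TM)^{-1} \end{bmatrix} = I.$$

Since $AA^{-1} = I = A^{-1}A$, the following relationship can also be established:

$$\left[\, H(H^TH)^{-1} \mid M(M^TM)^{-1} \,\right] \begin{bmatrix} H^T \\ M^T \end{bmatrix} = H(H^TH)^{-1}H^T + M(M^TM)^{-1}M^T = I.$$

Rearranging this result yields: $M(M^TM)^{-1}M^T = I - H(H^TH)^{-1}H^T$.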
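
The identity can be checked numerically on a small case. The sketch below
assumes q = 3 so that M reduces to a single column, which is taken as the
cross product of the two columns of H; the matrix entries and helper names are
arbitrary illustrative choices.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inv2(a):
    """Inverse of a 2 x 2 matrix."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

# H is 3 x 2 with full column rank (illustrative values).
H = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
Ht = transpose(H)
h1, h2 = Ht

# For q = 3, M has one column orthogonal to both columns of H: their cross product.
m = [h1[1] * h2[2] - h1[2] * h2[1],
     h1[2] * h2[0] - h1[0] * h2[2],
     h1[0] * h2[1] - h1[1] * h2[0]]

# H (H^T H)^{-1} H^T and M (M^T M)^{-1} M^T
proj_h = matmul(matmul(H, inv2(matmul(Ht, H))), Ht)
mm = sum(v * v for v in m)
proj_m = [[m[i] * m[j] / mm for j in range(3)] for i in range(3)]

total = [[proj_h[i][j] + proj_m[i][j] for j in range(3)] for i in range(3)]
for row in total:
    print([round(v, 12) for v in row])
```

The printed matrix is the 3 x 3 identity to within rounding error, which is
the relationship established above for this particular H.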
Appendix 3.3: Verification of Weighted-Parameter Confidence Bounds

As described in Section 3.5, simulations were produced using a
40-parameter weighting sequence to generate system input/output data. An
off-line least-squares algorithm was used to estimate the model parameters,
and the parameter estimates and associated covariance matrix were used to
produce frequency response confidence regions. Using this same information,
parameter weights may be identified to minimize the maximum frequency
response uncertainty bound at any specified frequency as described in
Section 3.4.2. For a given set of optimal weights, an approximate
$\alpha \times 100\%$ confidence region for $Q_w$ (eqn 3.30) is given by:

$$Q_w = \Delta\theta^T W_{opt}^{-1} \Delta\theta \le Q_{w_{\alpha,q}} \qquad \text{(A3.1)}$$

where $Q_{w_{\alpha,q}}$ is defined by eqn 3.33.

By definition, the elements of $\Delta\theta$ are standard normal random
variables. Hence, eqn A3.1 can be used to verify the accuracy of
$Q_{w_{\alpha,q}}$ using Monte Carlo simulations. To accomplish this task,
optimal parameter weights were identified at three different frequencies
(ωT = 5.64°, 75.4°, and 179.0°) using the information from Simulation 2 in
Section 3.5. For these optimal weights, 100 trials were generated at each
frequency to investigate the accuracy of $Q_{w_{\alpha,q}}$ for
$\alpha = 0.974$. Each individual trial consisted of 6000 distinct test
samples in which the 40 elements of $\Delta\theta$ were generated (using a
random number generator) as independent normal random variables with zero
mean and unit variance and the quantity
$\Delta\theta^T W_{opt}^{-1} \Delta\theta$ was calculated and compared to
$Q_{w_{.974,40}}$. A test sample was judged to be a success if
$\Delta\theta^T W_{opt}^{-1} \Delta\theta \le Q_{w_{.974,40}}$ and a failure
if $\Delta\theta^T W_{opt}^{-1} \Delta\theta > Q_{w_{.974,40}}$, and the
total number of successes (out of 6000 samples) for each trial was recorded.

A statistical summary of the results of the Monte Carlo simulations
described above is presented in Table A3.1. For $\alpha = 0.974$, the
anticipated number of successes for any one trial (6000 test samples) is
5844. As shown in the table, this value compares quite favourably with the
average number of successes recorded for the 100 trials at each frequency.
Indeed, the difference between the average success rate from the simulations
and the desired rate of 0.974 ranged from 0.08% to 0.28% for the three
frequencies examined. These differences agree closely with similar results
produced by Jensen and Solomon [JEN1] for much smaller numbers of parameters
and suggest that the normal approximation used to generate the
parameter-weighted confidence bound, $Q_{w_{\alpha,q}}$, is accurate for much
larger parameter sets.


Frequency for            Boundary Failures Per Trial     Average Success Rate
Parameter Optimization   Mean      Standard Deviation    ((6000 - Mean) / 6000)
(deg)

    5.64                 5860.3         11.3                  0.9767
   75.40                 5848.8         12.2                  0.9748
  179.0                  5859.0         11.9                  0.9765

Table A3.1: Monte-Carlo Simulation Results
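
The figures quoted in this appendix can be checked with a few lines of
arithmetic. The sketch below assumes that each of the 6000 test samples in a
trial is an independent success with probability 0.974 (so that the success
count is binomial), and it treats the tabulated means as success counts, which
is how the accompanying text describes them. The variable names are
illustrative.

```python
import math

alpha = 0.974
samples_per_trial = 6000

# Anticipated successes per trial and the binomial standard deviation of that count.
expected_successes = alpha * samples_per_trial
binomial_sd = math.sqrt(samples_per_trial * alpha * (1.0 - alpha))

# Table A3.1: wT (deg) -> (mean count per trial, standard deviation per trial)
table_a3_1 = {5.64: (5860.3, 11.3), 75.40: (5848.8, 12.2), 179.0: (5859.0, 11.9)}

success_rate = {freq: mean / samples_per_trial for freq, (mean, _) in table_a3_1.items()}

print(f"expected successes = {expected_successes:.1f}, binomial sd = {binomial_sd:.2f}")
for freq in sorted(success_rate):
    rate = success_rate[freq]
    print(f"wT = {freq:6.2f} deg: rate = {rate:.4f}, excess over alpha = {100 * (rate - alpha):.2f} %")
```

Under the binomial assumption the standard deviation of the count is about
12.3, which is of the same order as the tabulated values of 11.3 to 12.2.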


CHAPTER  FOUR 


A GEOMETRIC FREQUENCY RESPONSE-BASED TRUNCATION CRITERION

The  results  in  Chapter  3  establish  a  statistical  description  of  the 
uncertainty  associated  with  the  estimated  frequency  response  for  any  given 
system.  This  description  is  obtained  using  a  parametric  approach  to  the 
system  identification  problem,  and  a  key  ingredient  in  the  development  is 
the  selection  of  finite  weighting  sequence  models  to  describe  system 
dynamics.  Since  most  systems  are  more  realistically  described  by  infinite 
weighting  sequences,  any  practical  implementation  of  the  procedure  to 
accurately  quantify  frequency  response  uncertainty  must  necessarily  rely  on 
an  effective  truncation  of  the  true  weighting  sequence. 

The  truncation  problem  described  here  is,  in  essence,  a  special  case  of 
the  general  model  order  selection  problem  (a  problem  that  has  received  wide 
attention  in  the  literature).  It  seems  reasonable,  therefore,  to  expect 
that  any  of  the  most  widely-used  criteria  can  be  introduced  to  solve  this 
special problem. Here, however, the primary goal of the identification
process  is  to  obtain  accurate  frequency  response  information.  Since  the 
selected  model  serves  only  as  a  vehicle  for  generating  this  information,  an 
appropriate  truncation  criterion  should  concentrate  on  frequency  response 
considerations.  Unfortunately,  the  available  order  selection  criteria  focus 
solely  on  the  problem  of  generating  accurate  parametric  (time-domain) 
descriptions  of  the  system.  Hence,  they  are  not  well  suited  to  this 
frequency-domain  problem. 

Existing  results  do,  however,  offer  valuable  insights  into  the 
development  of  an  appropriate  frequency  response-based  criterion.  In  this 
chapter,  a  geometric  interpretation  of  the  time-domain  order  selection 
problem  is  used  to  highlight  important  concepts  for  the  development  of  an 
alternative  frequency-dependent  criterion.  The  geometric  relationships 
developed  from  this  analysis  are  extended  to  the  complex  Nyquist  plane  to 
produce  the  desired  frequency  response-based  criterion.  The  resulting 
criterion is easy to implement and, as demonstrated by Monte Carlo
simulations, identifies the correct dependence of truncation level on
frequency.  Thus,  for  specified  individual  frequencies,  the  criterion 
identifies  the  proper  finite-length  model  required  to  produce  accurate 
statistical  bounds  on  system  frequency  response. 

4.1  The  Time-Domain  Order  Selection  Problem:  A  Geometric  Perspective 

A common goal of model order selection is to establish an appropriate
trade-off between variability in the parameter estimates and quality of fit
to the observed output data. Indeed, many order selection criteria have
been proposed to achieve this goal through the use of statistical
information on the parameter estimates [FRE1]. However, using geometrical
interpretations of Akaike's AIC criterion [AKA1], it is possible to
establish an alternative criterion which achieves the same goal and yet is
amenable to extension for frequency response purposes.

Under the assumption that the "best" model is smaller in dimension than
the true model, system output (as defined by eqn 3.2) can be rewritten in
the following way:

$$y = y^o + \varepsilon = D\theta^o + \varepsilon = D_q\theta^o_q + D_{\bar q}\theta^o_{\bar q} + \varepsilon \qquad (4.1)$$

where $y^o = D\theta^o \in R^N$ is the true output; $\theta^o \in R^{q_o}$ is
the set of true parameters; $\theta^o_q$ denotes the first q elements of
$\theta^o$; $\theta^o_{\bar q}$ denotes the elements of $\theta^o$ not
included in $\theta^o_q$; and $D_q$ and $D_{\bar q}$ are matrices defined by
partitioning D appropriately. For any given q, $y^o$ can be estimated by
$\hat y = D_q\hat\theta_q$ where $\hat\theta_q$ represents the least-squares
estimate defined by eqn 3.3. Given this relationship between system output
and model parameters, it seems reasonable to suggest that the order
selection process should focus on identifying the model which most closely
matches the estimated output, $\hat y$, to the true output, $y^o$, a goal
achieved by selecting the order which minimizes
$J_y = \|y^o - \hat y\|^2$. Since $\hat y$ is random, a practical alternative
cost function is given by:


$$E\{J_y\} = E\{\|y^o - \hat y\|^2\} = \|y^o - y^*\|^2 + E\{\|y^* - \hat y\|^2\} \qquad (4.2)$$

where $y^* = P_q y^o = D_q\theta^*_q$ is the orthogonal projection of $y^o$
onto the q-dimensional subspace of $R^N$ spanned by the columns of $D_q$,
$P_q = D_q(D_q^TD_q)^{-1}D_q^T$ is the orthogonal projection operator onto the
range space of $D_q$, and $\theta^*_q = E\{\hat\theta_q\}$. Indeed, this
formulation of the problem not only ensures that the selected model will, on
average, minimize $\|y^o - \hat y\|^2$, but it also implicitly establishes a
trade-off between variability (as measured by $E\{\|y^* - \hat y\|^2\}$) and
bias ($\|y^o - y^*\|^2$).

A solution to the problem posed above clearly relies on the ability to
identify the bias and variability terms associated with each model order.
To visualize the development of appropriate estimates for these quantities,
consider the simplified case when N = 3 and q = 2 as displayed in Figure
4.1 (where the directed line segments BA, DB, and AC denote the vectors
$(y^o - y^*)$, $(y^* - \hat y)$, and the true noise $\varepsilon$,
respectively). [Note: Although the following discussion makes reference to
this diagram, the results derived are valid for arbitrary N and q.] When
$\varepsilon$ is Gaussian white noise,
$E\{\varepsilon^T\varepsilon\} = \|AC\|^2 = N\sigma_\varepsilon^2$. Since
$(\hat y - y^*) = BD$ is the orthogonal projection of $\varepsilon$ onto the
specified q-dimensional subspace, it can, therefore, be shown that
$E\{\|y^* - \hat y\|^2\} = q\sigma_\varepsilon^2$. The bias, however, is more
difficult to estimate because both $\theta^o$ and $\theta^*_q$ are unknown.
An accurate estimate can be generated in the following way. By adding and
subtracting y, $\|y^o - y^*\|^2$ can be written as:

$$\|y^o - y^*\|^2 = \|y^o - y\|^2 + 2\,(y^o - y)^T(y - y^*) + \|y - y^*\|^2 \qquad (4.3)$$

Note that eqn 4.3 is obtained by rewriting the vector BA in Figure 4.1 as
the sum of BC and CA. By definition,
$\|y^o - y\|^2 = \|CA\|^2 = \varepsilon^T\varepsilon$ and the two remaining
terms in eqn 4.3 can be rewritten in the following form:

$$2\,(y^o - y)^T(y - y^*) = -2\,\varepsilon^T(I - P_q)D_{\bar q}\theta^o_{\bar q} - 2\,\varepsilon^T\varepsilon \qquad (4.4a)$$

$$\|y - y^*\|^2 = \|y - \hat y\|^2 + 2\,(y - \hat y)^T(\hat y - y^*) + \|\hat y - y^*\|^2 \qquad (4.4b)$$

where eqn 4.4a is derived from the definitions of y, $y^o$, and $y^*$, and
eqn 4.4b is generated by replacing BC in Figure 4.1 by the vector sum of BD
and DC. Since $E\{\varepsilon\} = E\{\hat y - y^*\} = 0$, the terms
$-2\,\varepsilon^T(I - P_q)D_{\bar q}\theta^o_{\bar q}$ and
$2\,(y - \hat y)^T(\hat y - y^*)$ will, on average, be zero and, hence, these
terms can be neglected. As discussed earlier,
$E\{\|\hat y - y^*\|^2\} = \|BD\|^2 = q\sigma_\varepsilon^2$. So for any
given test, the best available estimate of $\|\hat y - y^*\|^2$ is $qs^2$
where $s^2$ is defined by Lemma 3.2 (eqn 3.14). In addition,
$\|y - \hat y\|^2 = \|DC\|^2 = \hat\varepsilon^T\hat\varepsilon$ is simply the
sum of the squares of the residuals of the given data fit. Finally, since
$\varepsilon^T\varepsilon = \|AC\|^2$ is independent of model order, it is
constant for all specified models. Thus, $\|y^o - y^*\|^2$ may be accurately
estimated by $\hat\varepsilon^T\hat\varepsilon + qs^2 + \text{constant}$.
Rewriting eqn 4.2 in terms of the expressions derived above yields:


$$E\{J_y\} = \hat\varepsilon^T\hat\varepsilon + 2qs^2 + \text{constant},$$

and the correct model order can be selected using the following criterion:

Criterion 4.1: Select the model order q which minimizes

$$J_y = \hat\varepsilon^T\hat\varepsilon + 2qs^2 \qquad (4.5)$$

over all model orders, where $\hat\varepsilon = (y - D_q\hat\theta_q)$ is the
vector of residuals for the given parameter estimates.

It is interesting to note that, for q << N, this criterion can be shown (on
the basis of expressions developed in Chapter 5) to yield the same result as
practical implementations of Akaike's AIC criterion.

Figure 4.1: A Three-Dimensional Perspective
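
Once the residual sum of squares is available for each candidate order, the
criterion of eqn 4.5 reduces to a one-line cost evaluation. The Python sketch
below is illustrative only: it assumes that the variance estimate for each
order is the residual sum of squares divided by (N - q), which may differ from
the definition in Lemma 3.2, and the function name and sample values are
hypothetical.

```python
def select_order_criterion_4_1(rss_by_order, n_samples):
    """Return (q_opt, costs) with costs[q] = RSS_q + 2 q s_q^2."""
    costs = {}
    for q, rss in rss_by_order.items():
        s2 = rss / (n_samples - q)      # ASSUMED variance estimate for order q
        costs[q] = rss + 2.0 * q * s2   # eqn 4.5
    q_opt = min(costs, key=costs.get)
    return q_opt, costs

# Hypothetical residual sums of squares for orders 1..5 with N = 100 samples.
rss_by_order = {1: 50.0, 2: 20.0, 3: 10.0, 4: 9.9, 5: 9.85}
q_opt, costs = select_order_criterion_4_1(rss_by_order, n_samples=100)
print(q_opt, {q: round(j, 3) for q, j in costs.items()})
```

In this made-up example the fit improves sharply up to the third order and
only marginally afterwards, so the penalty term 2qs^2 outweighs the further
reduction in the residuals and the third order is selected.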

4.2  Adaptations  for  Frequency  Response  Applications 

When frequency response estimation is the goal, the output
relationships described above are no longer relevant. Instead, a more
realistic objective is to minimize the distance between the estimated
frequency response, $\hat g$, and the true frequency response, $g^o$, at some
specified frequency, ωT. Since $\hat g$ is random, an appropriate measure of
this distance is given by:

$$J_g = E\{\|g^o - \hat g\|^2\} = \|g^o - g^*\|^2 + E\{\|g^* - \hat g\|^2\} \qquad (4.6)$$

where $g^* = E\{\hat g\}$. This formulation again highlights the
bias/variability tradeoff, a tradeoff that has been discussed in general
frequency response terms by Ljung and Yuan [LJU1]. To establish an
appropriate order selection criterion, however, one needs accurate estimates
of both quantities.

For weighting sequence models, appropriate estimates may be obtained
using an approach analogous to that of the previous section. A key
ingredient in this development is the linear relationship between the model
parameters and system frequency response described by eqn 3.9. First,
consider $E\{\|g^* - \hat g\|^2\}$. Since $\hat g = H_q^T\hat\theta_q$ and
$g^* = H_q^T\theta^*_q$, this quantity can be written as:

$$E\{\|g^* - \hat g\|^2\} = \mathrm{tr}\left[H_q^T\, E\{(\theta^*_q - \hat\theta_q)(\theta^*_q - \hat\theta_q)^T\}\, H_q\right]$$

By definition, $\hat\theta_q - \theta^*_q = (D_q^TD_q)^{-1}D_q^T\varepsilon$.
So, $E\{(\theta^*_q - \hat\theta_q)(\theta^*_q - \hat\theta_q)^T\} = \sigma_\varepsilon^2 (D_q^TD_q)^{-1}$
and the expression above reduces to:

$$E\{\|g^* - \hat g\|^2\} = \sigma_\varepsilon^2\, \mathrm{tr}\left[H_q^T(D_q^TD_q)^{-1}H_q\right] \qquad (4.7)$$

Next, consider $\|g^o - g^*\|^2$. To estimate this quantity, define

$$g_m = H_o^T D^L y \qquad (4.8)$$

where $D^L = (D^TD)^{-1}D^T$ is the left inverse of D and the dimensions of D
and $H_o^T$ are (N x $q_o$) and (2 x $q_o$) respectively. The bias can then
be written as:


$$\|g^o - g^*\|^2 = \|g^o - g_m\|^2 + 2\,(g^o - g_m)^T(g_m - g^*) + \|g_m - g^*\|^2 \qquad (4.9)$$

Using the relationships for y and $g_m$ given by eqns 4.1 and 4.8, the three
terms above can be rewritten as:

$$\|g^o - g_m\|^2 = \|H_o^T D^L \varepsilon\|^2$$

$$2\,(g^o - g_m)^T(g_m - g^*) = -2\,\varepsilon^T (D^L)^T H_o \{H_o^T\theta^o - H_q^T\theta^*_q\} - 2\,\|H_o^T D^L \varepsilon\|^2$$

$$\|g_m - g^*\|^2 = \|g_m - \hat g\|^2 + 2\,(g_m - \hat g)^T(\hat g - g^*) + \|\hat g - g^*\|^2$$

Since $E\{\varepsilon\} = E\{\hat g - g^*\} = 0$, the terms
$-2\,\varepsilon^T (D^L)^T H_o \{H_o^T\theta^o - H_q^T\theta^*_q\}$ and
$2\,(g_m - \hat g)^T(\hat g - g^*)$ will, on average, be zero and, hence,
these terms can be neglected. With these simplifications, substitution of
the above expressions (together with eqn 4.7, using $s^2$ as the estimate of
$\sigma_\varepsilon^2$) into eqn 4.9 yields the following result:

$$\|g^o - g^*\|^2 = \|g_m - \hat g\|^2 + s^2\, \mathrm{tr}\left[H_q^T(D_q^TD_q)^{-1}H_q\right] - \|H_o^T D^L \varepsilon\|^2 \qquad (4.10)$$

Now, combining eqns 4.6, 4.7 and 4.10 and recognizing that
$\|H_o^T D^L \varepsilon\|^2$ is constant for all q produces:

$$J_g = \|g_m - \hat g\|^2 + 2s^2\, \mathrm{tr}\left[H_q^T(D_q^TD_q)^{-1}H_q\right] + \text{constant}$$

Hence, the correct order can be selected using the following criterion:

Criterion 4.2: Select the model order q which minimizes

$$J_g = \|g_m - \hat g\|^2 + 2s^2\, \mathrm{tr}\left[H_q^T(D_q^TD_q)^{-1}H_q\right] \qquad (4.11)$$

over all model orders, where $H_q$ is calculated at the specified frequency
of interest.




To evaluate eqn 4.11, $g_m$ must be calculated using the relationship
defined by eqn 4.8. This implies that $q_o$ must be known. But for weighting
sequences, $q_o$ is theoretically infinite. In practice, one can choose an
appropriately large finite value to obtain accurate results. Indeed, the
simulation results in the next section demonstrate that the chosen
truncation level is insensitive to this specification provided $q_o$ is
sufficiently large.
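
The steps of Criterion 4.2 can be sketched in plain Python as shown below.
Several details are assumptions made for illustration and are not taken from
the thesis: the frequency response map of eqn 3.9 is assumed to be
g = sum_k theta_k exp(-j k wT), the variance estimate is assumed to be the
residual sum of squares divided by (N - q), and the test system, sample size
and value of q_o are chosen small enough to run quickly. All function names
are hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def criterion_4_2(u, y, q_o, omega_t):
    """Return (q_opt, J) where J[q-1] is the cost of eqn 4.11 for order q."""
    n = len(y)
    # Column k of D holds the input delayed by k samples (zero before the test starts).
    cols = [[u[t - k] if t - k >= 0 else 0.0 for t in range(n)] for k in range(1, q_o + 1)]
    gram = [[dot(ci, cj) for cj in cols] for ci in cols]        # D^T D
    rhs = [dot(ci, y) for ci in cols]                            # D^T y
    yy = dot(y, y)
    # ASSUMED rows of H^T: real and imaginary parts of exp(-j k wT).
    re = [math.cos(k * omega_t) for k in range(1, q_o + 1)]
    im = [-math.sin(k * omega_t) for k in range(1, q_o + 1)]

    def fit(q):
        gq = [row[:q] for row in gram[:q]]
        theta = solve(gq, rhs[:q])                               # least-squares estimate
        g = (dot(re[:q], theta), dot(im[:q], theta))
        s2 = (yy - dot(theta, rhs[:q])) / (n - q)                # ASSUMED variance estimate
        trace = dot(re[:q], solve(gq, re[:q])) + dot(im[:q], solve(gq, im[:q]))
        return g, s2, trace

    g_m = fit(q_o)[0]                                            # eqn 4.8
    costs = []
    for q in range(1, q_o + 1):
        g, s2, trace = fit(q)
        costs.append((g_m[0] - g[0]) ** 2 + (g_m[1] - g[1]) ** 2 + 2.0 * s2 * trace)
    return 1 + min(range(q_o), key=costs.__getitem__), costs

# Illustrative test: first-order system 0.75 / (z - 0.85), noise variance 0.0625.
random.seed(0)
N, Q_O = 400, 40
u = [random.gauss(0.0, 1.0) for _ in range(N)]
y, y_prev, u_prev = [], 0.0, 0.0
for t in range(N):
    y_true = 0.85 * y_prev + 0.75 * u_prev
    y.append(y_true + random.gauss(0.0, 0.25))
    y_prev, u_prev = y_true, u[t]

q_opt, J = criterion_4_2(u, y, Q_O, math.radians(45.0))
print(q_opt)
```

The sketch works from the normal equations so that each candidate order reuses
the leading block of the same Gram matrix; a production implementation would
more likely use an orthogonal factorization for numerical robustness.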


4.3  Monte  Carlo  Simulation  Results 

Input/output  simulations  were  performed  using  two  systems  described  by 
the  transfer  functions: 


$$g_{S1}(z) = \frac{0.75}{z - 0.85} \qquad\qquad g_{S2}(z) = \frac{0.5\,(z + 0.7)}{z^2 - 1.65z + 0.68}$$
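
To see why truncation matters for these two systems, it helps to generate
their weighting sequences directly. The Python sketch below computes the
impulse response of a rational transfer function by its difference equation;
it assumes the second denominator is z^2 - 1.65z + 0.68, and the function name
and sequence length are illustrative.

```python
def weighting_sequence(num, den, length):
    """Impulse response h[0..length-1] of num(z)/den(z).

    Coefficients are in descending powers of z, with den[0] == 1 and
    len(num) <= len(den).
    """
    n = len(den) - 1
    b = [0.0] * (n + 1 - len(num)) + list(num)   # pad numerator to degree n
    h = []
    for k in range(length):
        acc = b[k] if k <= n else 0.0
        for i in range(1, n + 1):
            if k - i >= 0:
                acc -= den[i] * h[k - i]
        h.append(acc)
    return h

h_s1 = weighting_sequence([0.75], [1.0, -0.85], 150)
h_s2 = weighting_sequence([0.5, 0.35], [1.0, -1.65, 0.68], 150)

# Share of the steady-state gain carried by weights beyond the first 30 terms.
for name, h in (("S1", h_s1), ("S2", h_s2)):
    tail = sum(h[31:]) / sum(h)
    print(f"{name}: first weights {[round(v, 4) for v in h[1:4]]}, tail share after 30 terms = {tail:.4f}")
```

Both sequences decay geometrically, so any finite model discards a tail whose
contribution to the frequency response depends on the frequency considered.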

For each simulation, random inputs with unit variance were used to generate
N = 1000 output samples, each corrupted by Gaussian noise with variance
$\sigma_\varepsilon^2 = 0.0625$. Criterion 4.2 was implemented using various
values for $q_o$, and the resulting frequency-dependent truncation levels
(along with the truncation which minimized the true value of
$\|g^o - \hat g\|^2$) were observed. The procedure above was repeated 100
times for each system to produce a representative sample, and the resulting
sample statistics are presented in Table 4.1. The results obtained using
Criterion 4.2 (denoted by $J_g$ in the table) can be seen to give good
agreement with the true optimal truncation levels (denoted by
$J = \|g^o - \hat g\|^2$ in the table). Furthermore, the results suggest
that the level of truncation selected is insensitive to the value of $q_o$
used, provided $q_o$ is chosen to be sufficiently large. More importantly,
unlike standard order selection criteria [FRE1], Criterion 4.2 does not
identify a single best order, but rather, in agreement with the true
results, establishes the correct dependence of truncation level on
frequency. For comparison, the commonly-used interpretation (defined by the
cost function $J_A = N \ln(\hat\varepsilon^T\hat\varepsilon) + 2q$) of
Akaike's AIC criterion as well as Bhansali and Downham's modified AIC
criterion (defined by the cost function
$J_B = N \ln(\hat\varepsilon^T\hat\varepsilon) + 4q$) were applied to the same
test data, and the following results (average truncation level ± 1 standard
deviation) were obtained:

         q_opt for J_A      q_opt for J_B

S1:       27.4 ± 2.6         24.4 ± 1.8
S2:       32.7 ± 4.5         28.3 ± 1.7

A  comparison  of  these  values  with  the  true  results  in  Table  4.1  demonstrates 
that  the  single  truncation  levels  obtained  using  these  "parameter  space" 
criteria  do  not  accurately  reflect  the  true  optimal  trade-off  between  bias 
and  variability  of  the  frequency  response  estimates;  yielding  results  which 
do  not  properly  account  for  bias  at  low  frequencies  and  which  produce 
unnecessarily  conservative  confidence  bounds  at  high  frequency.  For 
frequency  response  applications  therefore,  Criterion  4.2  can  be  seen  to 
produce  results  which  are  superior  to  those  of  other  commonly-used  order 
selection  criteria. 
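
The two "parameter space" cost functions used in this comparison differ only
in the penalty applied per parameter, which can be illustrated as follows.
The residual sums of squares in this Python sketch are invented for the
example and the function name is hypothetical.

```python
import math

def select_order(rss_by_order, n_samples, penalty):
    """Order minimizing N ln(RSS_q) + penalty * q (penalty 2: J_A, penalty 4: J_B)."""
    costs = {q: n_samples * math.log(rss) + penalty * q
             for q, rss in rss_by_order.items()}
    return min(costs, key=costs.get)

# Hypothetical residual sums of squares for orders 2..5 with N = 100 samples.
rss_by_order = {2: 20.0, 3: 10.0, 4: 9.7, 5: 9.6}
q_a = select_order(rss_by_order, 100, penalty=2.0)
q_b = select_order(rss_by_order, 100, penalty=4.0)
print(q_a, q_b)
```

Because the heavier penalty can never favour a larger model, the order chosen
with 4q is at most the order chosen with 2q, which agrees with the direction
of the averages reported above. Neither cost function depends on frequency,
so each returns a single truncation level for the whole response.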


       ωT      q_opt for            q_opt for        q_opt for        q_opt for
      (deg)    J = ||g° - ĝ||²      J_g (q_o=100)    J_g (q_o=125)    J_g (q_o=150)

         0     35.9 ± 6.7           32.3 ± 7.1       32.7 ± 7.5       32.1 ± 8.4
S1:     15     33.2 ± 7.5           28.5 ± 6.3       28.6 ± 7.2       28.7 ± 7.3
        45     23.5 ± 3.7           19.0 ± 3.0       18.3 ± 3.5       17.4 ± 3.7

         0     39.8 ± 7.4           35.7 ± 7.5       35.0 ± 7.6       34.6 ± 8.3
S2:     15     36.5 ± 7.3           32.6 ± 6.7       31.8 ± 6.7       30.3 ± 7.2
        45     25.6 ± 3.4           21.3 ± 3.8       20.8 ± 4.0       18.9 ± 3.8

Table 4.1: Statistical Summary of Selected Truncations from Simulation
(Average Value ± 1 Standard Deviation)
CHAPTER FIVE


OPTIMAL TRUNCATION WITH BIAS IDENTIFICATION

As  discussed  previously,  the  use  of  finite  weighting  sequences  to 
describe  system  dynamics  imposes  a  practical  requirement  to  establish  an 
appropriate  level  of  truncation.  Furthermore,  our  desire  to  accurately 
characterize  frequency  response  uncertainty  suggests  that  any  criterion  used 
to  establish  the  truncation  should  incorporate  both  the  available 
statistical  information  from  the  identification  process  and  appropriate 
frequency  response  information.  The  geometric  criterion  proposed  in  Chapter 
4  offers  one  means  of  accomplishing  this  task.  This  criterion  leads  to  an 
easy-to-implement  procedure  that  incorporates  frequency  response  information 
at  a  single  specified  frequency  (via  the  linear  transformation  defined  by 
eqn  3.9)  to  generate  appropriate  frequency-dependent  truncation  levels.  It 
is,  however,  frequency-specific.  As  such,  the  resulting  truncation  (when 
used  to  develop  the  statistical  description  of  uncertainty  described  in 
Chapter  3)  yields  an  accurate  description  of  frequency  response  uncertainty 
over  only  a  small  range  of  frequencies  in  the  vicinity  of  the  selected 
frequency.  Indeed,  because  the  criterion  does  not  account  for  bias  at  other 
frequencies,  the  statistical  bounds  derived  from  the  selected  model  at  these 
other  frequencies  may  either  be  unnecessarily  conservative  or  may  actually 
fail  to  include  the  true  frequency  response  of  the  system.  As  a  result, 
though  the  uncertainty  results  at  the  specified  frequency  are  valid  and 
accurate,  it  may  not  be  possible  to  generate  accurate  bounds  at  other 
frequencies.  Hence,  it  may  not  be  possible  to  describe  accurately  the 
interdependence  of  the  frequency  response  estimates. 

To  retain  the  characterization  of  interfrequency  dependence  that  is 
inherent  in  the  statistical  bounds  established  by  Theorem  3.3  and  to  ensure 
that  these  bounds  produce  an  accurate  description  of  uncertainty,  other 
methods  of  identifying  the  proper  truncation  must  be  investigated.  Again, 
commonly-used  parameter-space  criteria  provide  a  useful  starting  point  for 
this investigation. Practical implementations of these criteria, as
suggested in the previous chapter, are not generally well suited to this
specific problem for a number of reasons including their failure to include
frequency response information and their inability to quantify the bias
introduced by the selected model. However, the theoretical foundations of
these criteria do provide valuable insights into the solution of the
problem. More specifically, the statistical characteristics associated with
Akaike's information theoretic criterion [AKA1], when combined with certain
unique characteristics of weighting sequence models, establish a framework
for the development of a new truncation criterion to accomplish the desired
tasks.

The  goal  of  this  chapter  is  to  develop  an  alternative  criterion  (using 
the  framework  mentioned  above)  which  combines  information  on  the  variability 
of the estimates and the bias introduced by truncation to establish an
optimal  level  of  truncation.  It  will  be  shown  that  this  criterion  not  only 
identifies  the  truncation  which  produces  an  optimal  statistical  trade-off 
between  variability  and  bias,  but  that  it  also  establishes  an  explicit  upper 
bound  on  the  bias  introduced  by  this  truncation.  Furthermore,  this  bias 
bound  can  be  generated  in  either  the  parameter  space  or  the  frequency  domain 
to  quantify  either  "parameter"  bias  or  frequency  response  bias.  The  chapter 
begins  by  developing  the  proposed  criterion  in  the  parameter  space  and  then 
extends  the  results  to  establish  an  appropriate  frequency-domain  alternative 
which  provides  the  remaining  information  needed  to  completely  quantify 
frequency  response  uncertainty. 

5.1  Optimal  Truncation:  Preliminary  Developments 

5.1.1  Akaike's  Criterion  and  Associated  Implementation  Problems 

For  systems  described  by  the  input-output  relationship  in  eqn  3.1,  the 
development  of  an  appropriate  system  description  relies  not  only  on  one's 
ability  to  generate  parameter  estimates  (for  a  given  model)  which  provide 
the best fit to the observed data, but also on one's ability to select the
proper model from a specified class of models. One widely-used criterion to
achieve this second objective is the information theoretic criterion
proposed by Akaike [AKA1]. Using this statistical criterion, the "best"
model for the system can be defined by:

Definition 5.1: For the system described by eqn 3.1 and any specified set
of input/output data, the "best" model for the system is the model
identified by minimizing:

AIC = -2 ln{maximum likelihood} + 2 {number of free parameters}

or, for Gaussian white noise,

$$\mathrm{AIC} = N \ln\{\hat\sigma_\varepsilon^2\} + 2q \qquad (5.1)$$

where N is the number of output measurements, q is the number of
parameters in the assumed model (i.e. the dimension of the model), and
$\hat\sigma_\varepsilon^2$ represents the maximum likelihood estimate of
$\sigma_\varepsilon^2$ obtained from the residuals of the data fit.

A key element in the development of minimum AIC as an appropriate criterion
for model selection is the assumption that the class of models from which
the "best" model is to be selected does not contain the true model, but
instead contains only models of lower order which are "sufficiently close"
to it [AKA1]. This assumption necessarily implies that the residuals
obtained for any estimated model in the class include a non-random component
resulting from the difference in order between the true and estimated
models. This non-random element poses a problem for practical
implementations of the minimum AIC criterion.

For any model in the specified class, the true maximum likelihood
estimate of $\sigma_\varepsilon^2$ is given by:

$$\hat\sigma_\varepsilon^2 = \{\hat\varepsilon - \mu\}^T \{\hat\varepsilon - \mu\} / N$$

where $\hat\varepsilon$ denotes the residuals generated by the estimated model
and $\mu$ represents the non-random component. Using this relationship, AIC
can be rewritten as:

$$\mathrm{AIC} = N \ln\{[\hat\varepsilon - \mu]^T[\hat\varepsilon - \mu]/N\} + 2q = N \ln\{[\hat\varepsilon - \mu]^T[\hat\varepsilon - \mu]\} - N \ln N + 2q$$

Since N ln N is constant for any given set of data, it can be ignored when
minimizing AIC, and an alternative form of the criterion is given by:

$$\mathrm{AIC} = N \ln\{[\hat\varepsilon - \mu]^T[\hat\varepsilon - \mu]\} + 2q \qquad (5.2)$$

But $\mu$ is unknown and, hence, implementation of the criterion originally
proposed by Akaike cannot be accomplished in practice.

For practical purposes however, an effective criterion can still be
established provided a reliable estimate of AIC is available. One
commonly-used estimate, denoted here by $\widehat{AIC}$, is obtained by
neglecting the non-random element to produce:

$\widehat{AIC} = N \ln\{\varepsilon^T \varepsilon\} + 2q$    ....(5.3)
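The difference between eqns 5.2 and 5.3 can be made concrete with a small
numerical sketch. The function names and the residual values below are
hypothetical and chosen only for illustration; in practice the non-random
component is unknown, which is exactly why eqn 5.2 cannot be evaluated
directly.

```python
import math

def aic_true(residuals, mu, q):
    """Eqn 5.2: N ln{(eps - mu)^T (eps - mu)} + 2q (needs the unknown mu)."""
    n = len(residuals)
    ss = sum((e - m) ** 2 for e, m in zip(residuals, mu))
    return n * math.log(ss) + 2 * q

def aic_hat(residuals, q):
    """Eqn 5.3: N ln{eps^T eps} + 2q, i.e. eqn 5.2 with mu neglected."""
    n = len(residuals)
    ss = sum(e ** 2 for e in residuals)
    return n * math.log(ss) + 2 * q

# Hypothetical residuals: a random part plus a constant non-random offset.
noise = [0.3, -0.2, 0.1, -0.4, 0.2]
mu = [0.5] * len(noise)
residuals = [e + m for e, m in zip(noise, mu)]

# Gap between the practicable estimate and the form that uses mu.
gap = aic_hat(residuals, q=3) - aic_true(residuals, mu, q=3)
```

For this particular data set the neglected component inflates the residual
sum of squares, so the gap is positive; its size depends on the model
order, which is why it can shift the location of the minimum.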

Indeed, the assumption that all models in the specified class are
"sufficiently close" to the true model appears to justify the further
assumption that $\mu$ can be neglected. However, when the statistical
perspective originally taken by Akaike is considered, this assumption is not
justified. When several models are sufficiently close to the true one, the
best model (from a statistical "distribution-matching" point of view) is the
one for which the estimated variance obtained from the residuals most
closely matches the true variance of the noise present in the data. Minimum
AIC (using the true maximum likelihood estimate of $\sigma_\varepsilon^2$)
accomplishes this task. On the other hand, $\widehat{AIC}$ (eqn 5.3)
introduces additional terms associated with the neglected element which can
corrupt the desired information and lead to an incorrect selection of model
order. Indeed, Monte Carlo simulations presented in [BHA1] and [FRE1]
clearly demonstrate the inadequacy of minimum $\widehat{AIC}$ as an
effective criterion for model selection.

5.1.2  Incremental  Adjustments  to  Improve  Order  Selection  Accuracy 

As suggested above, $\widehat{AIC}$ (eqn 5.3) is not accurate enough to be
used in place of AIC (eqn 5.2) to establish the correct model order. It is,
however, possible to identify the inaccuracies more clearly by examining the
incremental differences between AIC and $\widehat{AIC}$ as shown below. The
results of this comparison can be used to suggest adjustments which, when
included in $\widehat{AIC}$, will reproduce the true value of AIC. These
adjustments prove to be particularly helpful when weighting sequence
truncation is investigated.

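Consider a system described by the standard input/output relationship
in eqn 3.2. As shown in Section 4.1, this relationship can be rewritten in
the following way:

$y = D_q \theta_q + \tilde{D}_q \tilde{\theta}_q + \varepsilon = D_q \theta_q + e_q$

where $e_q = \tilde{D}_q \tilde{\theta}_q + \varepsilon$. Now, the
least-squares solution for $\theta_q$ is:

$\hat{\theta}_q = (D_q^T D_q)^{-1} D_q^T y$    ....(5.4)

and the residuals of the fit can be calculated by:

$\varepsilon_q = y - D_q \hat{\theta}_q = y - D_q (D_q^T D_q)^{-1} D_q^T y$

$= (D_q \theta_q + \tilde{D}_q \tilde{\theta}_q + \varepsilon) - (D_q \theta_q + P_q \tilde{D}_q \tilde{\theta}_q + P_q \varepsilon)$

$\varepsilon_q = (I - P_q)(\tilde{D}_q \tilde{\theta}_q + \varepsilon)$    ....(5.5)

where $P_q = D_q (D_q^T D_q)^{-1} D_q^T$ is the orthogonal projection
operator defined in the previous chapter. Thus, the mean value of
$\varepsilon_q$ is:

$\mu_q = E\{\varepsilon_q\} = (I - P_q) \tilde{D}_q \tilde{\theta}_q$    ....(5.6)

Furthermore,

$E\{[\varepsilon_q - \mu_q]^T [\varepsilon_q - \mu_q]\} = E\{\varepsilon^T (I - P_q) \varepsilon\} = (N - q) \sigma_\varepsilon^2$    ....(5.7)

and so subtracting $\mu = \mu_q$ from $\varepsilon_q$ produces an effective
estimate of the true noise variance.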
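The last equality in eqn 5.7 is stated without its intermediate step. A
brief sketch of the standard trace argument is given here, assuming (as in
the Gaussian white noise setting above) that the noise covariance is
$\sigma_\varepsilon^2 I$ and that $D_q$ has full column rank q:

```latex
\begin{aligned}
\varepsilon_q - \mu_q &= (I - P_q)\,\varepsilon , \qquad (I - P_q)^T (I - P_q) = I - P_q , \\
E\{\varepsilon^T (I - P_q)\,\varepsilon\}
  &= \operatorname{tr}\big[(I - P_q)\, E\{\varepsilon \varepsilon^T\}\big]
   = \sigma_\varepsilon^2 \operatorname{tr}(I - P_q) \\
  &= \sigma_\varepsilon^2 \big(N - \operatorname{tr}(P_q)\big)
   = (N - q)\, \sigma_\varepsilon^2 ,
\end{aligned}
```

where $\operatorname{tr}(P_q) = \operatorname{tr}\big[(D_q^T D_q)^{-1} D_q^T D_q\big] = q$.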

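For a model of order q, $AIC_q$ can now be rewritten in terms of
$\widehat{AIC}_q$, $\varepsilon_q$, and $\tilde{\theta}_q$ by combining eqns
5.2, 5.3 and 5.6 to produce:

$AIC_q = N \ln(\varepsilon_q^T \varepsilon_q) + N \ln(1 - \beta_q) + 2q = \widehat{AIC}_q + N \ln(1 - \beta_q)$    ....(5.8)

where

$\beta_q = \frac{2 \varepsilon_q^T (I - P_q) \tilde{D}_q \tilde{\theta}_q - \tilde{\theta}_q^T \tilde{D}_q^T (I - P_q) \tilde{D}_q \tilde{\theta}_q}{\varepsilon_q^T \varepsilon_q}$    ....(5.9a)

or, as a consequence of eqn 5.5,

$\beta_q = \frac{2 \varepsilon^T (I - P_q) \tilde{D}_q \tilde{\theta}_q + \tilde{\theta}_q^T \tilde{D}_q^T (I - P_q) \tilde{D}_q \tilde{\theta}_q}{\varepsilon_q^T \varepsilon_q}$    ....(5.9b)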


Before continuing, the following assumption will also be made:

Assumption 5.1: Let $\Theta$ denote the class of models within which the
"best" model exists and let $\tilde{\theta}_q$ denote the vector of elements
of the true system model not included in a model of dimension q within
$\Theta$. Then, it will be assumed that $\Theta$ contains only those models
for which the elements of $\tilde{\theta}_q$ are small enough to ensure that
$\beta_q$ (eqn 5.9b) is much less than unity. The dimension of the model of
lowest dimension within $\Theta$ will be denoted by $q_0$.

Assumption 5.1 simply implies the obvious; namely, that the order of the
"best" model cannot be arbitrarily small since this would lead to
unacceptably large errors in data fit. It also reiterates and, in effect,
quantifies Akaike's premise that the only models being considered are those
"sufficiently close" to the true model. Using this assumption,
$\ln(1 - \beta_q)$ can be accurately approximated by the first term in its
Taylor series expansion (i.e. $\ln(1 - \beta_q) \approx -\beta_q$), and the
following incremental relationship between AIC and $\widehat{AIC}$ (the
estimate of eqn 5.3) for model orders of q and q+1 can be established:

$\Delta AIC_q = AIC_{q+1} - AIC_q = \widehat{AIC}_{q+1} - \widehat{AIC}_q - N(\beta_{q+1} - \beta_q)$

$= \Delta\widehat{AIC}_q - N(\beta_{q+1} - \beta_q)$    ....(5.10)
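The quality of the first-order Taylor approximation used above can be
checked numerically. The sketch below is only illustrative (the function
names and the sample values of the ratio are hypothetical); it compares the
exact logarithm with its first-order replacement.

```python
import math

def log_term_exact(beta):
    """Exact value of ln(1 - beta)."""
    return math.log(1.0 - beta)

def log_term_approx(beta):
    """First term of the Taylor series of ln(1 - beta) about beta = 0."""
    return -beta

# Absolute approximation error for a few hypothetical values of beta.
errors = {
    beta: abs(log_term_exact(beta) - log_term_approx(beta))
    for beta in (0.001, 0.01, 0.1, 0.5)
}
```

The leading neglected term is $\beta^2/2$, and in the criterion it is
multiplied by N, so the requirement that the ratio be "much less than
unity" becomes more demanding as the number of measurements grows.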

A more detailed investigation of the error term, $N(\beta_{q+1} - \beta_q)$,
leads to the following result:

Result 5.1: Consider the class of models $\Theta$ established by Assumption
5.1. Let $\theta(q+1)$ represent the (q+1)th parameter of the true system
model, $\tilde{\theta}_{q+1}$ represent the vector of true parameters which
are neglected when a (q+1)-dimensional model within $\Theta$ is used, and
$d_1(q)$ denote the first column of $\tilde{D}_q$. Then, to within the
accuracy of the Taylor series approximation for $\ln(1 - \beta_q)$,
$\Delta AIC_q$ may be replaced by:

$\Delta AIC_q^{(1)} = \Delta\widehat{AIC}_q + E_q$    ....(5.11)

where

$E_q = \frac{N}{\varepsilon_q^T \varepsilon_q} \Big\{ \theta^2(q+1)\, d_1^T(q) [I - P_q] d_1(q) + \{2\theta(q+1) + \delta\}\, d_1^T(q) [I - P_q] \tilde{D}_{q+1} \tilde{\theta}_{q+1} + 2 \varepsilon^T [I - P_q] d_1(q) \{\theta(q+1) + \delta\} \Big\}$    ....(5.12)

$\delta = d_1^T(q) [I - P_q] \tilde{D}_{q+1} \tilde{\theta}_{q+1} / \gamma$

$\gamma = d_1^T(q) [I - P_q] d_1(q)$

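Proof: A comparison of eqns 5.10 and 5.11 indicates that
$E_q = -N(\beta_{q+1} - \beta_q)$ where $\beta_q$ is given by eqn 5.9b and
$\beta_{q+1}$ is given by:

$\beta_{q+1} = \frac{2 \varepsilon^T (I - P_{q+1}) \tilde{D}_{q+1} \tilde{\theta}_{q+1} + \tilde{\theta}_{q+1}^T \tilde{D}_{q+1}^T (I - P_{q+1}) \tilde{D}_{q+1} \tilde{\theta}_{q+1}}{\varepsilon_{q+1}^T \varepsilon_{q+1}}$

But, it can be shown that

$E\{\varepsilon_{q+1}^T \varepsilon_{q+1}\} = E\{\varepsilon_q^T \varepsilon_q\} + \mu_{q+1}^T \mu_{q+1} - \mu_q^T \mu_q - \sigma_\varepsilon^2$

$= E\{\varepsilon_q^T \varepsilon_q\} \left[ 1 + \frac{\mu_{q+1}^T \mu_{q+1} - \mu_q^T \mu_q - \sigma_\varepsilon^2}{E\{\varepsilon_q^T \varepsilon_q\}} \right]$

For models within $\Theta$, the second term in brackets on the right hand
side of this expression produces second order effects in $\beta_{q+1}$ which
can be neglected and, hence, $\beta_{q+1}$ can be accurately approximated by:

$\beta_{q+1} \approx \frac{2 \varepsilon^T (I - P_{q+1}) \tilde{D}_{q+1} \tilde{\theta}_{q+1} + \tilde{\theta}_{q+1}^T \tilde{D}_{q+1}^T (I - P_{q+1}) \tilde{D}_{q+1} \tilde{\theta}_{q+1}}{\varepsilon_q^T \varepsilon_q}$

Now, by partitioning $\tilde{D}_q$ and $D_{q+1}$ as shown here

$\tilde{D}_q = [\, d_1(q) \mid \tilde{D}_{q+1} \,]$

$D_{q+1} = [\, D_q \mid d_1(q) \,]$,

the following relationships can also be established:

$2 \varepsilon^T [I - P_q] \tilde{D}_q \tilde{\theta}_q = 2 \varepsilon^T [I - P_q] \, [\, d_1(q) \mid \tilde{D}_{q+1} \,] \begin{bmatrix} \theta(q+1) \\ \tilde{\theta}_{q+1} \end{bmatrix}$

$= 2 \varepsilon^T [I - P_q] d_1(q) \theta(q+1) + 2 \varepsilon^T [I - P_q] \tilde{D}_{q+1} \tilde{\theta}_{q+1}$    ....(5.13)

$\tilde{\theta}_q^T \tilde{D}_q^T [I - P_q] \tilde{D}_q \tilde{\theta}_q = [\, \theta(q+1) \;\; \tilde{\theta}_{q+1}^T \,] \begin{bmatrix} d_1^T(q) \\ \tilde{D}_{q+1}^T \end{bmatrix} [I - P_q] \, [\, d_1(q) \mid \tilde{D}_{q+1} \,] \begin{bmatrix} \theta(q+1) \\ \tilde{\theta}_{q+1} \end{bmatrix}$

$= \theta^2(q+1)\, d_1^T(q) [I - P_q] d_1(q) + 2 \theta(q+1)\, d_1^T(q) [I - P_q] \tilde{D}_{q+1} \tilde{\theta}_{q+1} + \tilde{\theta}_{q+1}^T \tilde{D}_{q+1}^T [I - P_q] \tilde{D}_{q+1} \tilde{\theta}_{q+1}$    ....(5.14)

$P_{q+1} = P_q + (P_q d_1 d_1^T P_q - d_1 d_1^T P_q - P_q d_1 d_1^T + d_1 d_1^T) / \gamma$    ....(5.15)

where eqn 5.15 is derived using the relationship between
$(D_{q+1}^T D_{q+1})^{-1}$ and $(D_q^T D_q)^{-1}$ presented in many system
identification texts (e.g. [EYK1], [STR1]). Eqn 5.12 can now be generated by
substituting eqns 5.13, 5.14, and 5.15 into the appropriate expressions for
$\beta_q$ and $\beta_{q+1}$ and performing the algebra required to compute
$E_q = -N(\beta_{q+1} - \beta_q)$.

....QED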
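The algebra left to the reader in this proof becomes shorter once the
bracketed term in eqn 5.15 is factored. The following sketch relies only on
the symmetry of $P_q$ and on the definitions of $\gamma$ and $\delta$ given
with eqn 5.12; the vector $v$ is introduced here purely as shorthand:

```latex
\begin{aligned}
P_{q+1} &= P_q + \frac{(I - P_q)\, d_1 d_1^T\, (I - P_q)}{\gamma},
\qquad \gamma = d_1^T (I - P_q)\, d_1 , \\
v^T (I - P_{q+1})\, v &= v^T (I - P_q)\, v
  - \frac{\big[d_1^T (I - P_q)\, v\big]^2}{\gamma}
  = v^T (I - P_q)\, v - \delta\, d_1^T (I - P_q)\, v ,
\qquad v = \tilde{D}_{q+1} \tilde{\theta}_{q+1} .
\end{aligned}
```

Subtracting the second line from eqn 5.14 reproduces the quadratic terms of
eqn 5.12, and applying the first line to
$2\varepsilon^T (I - P_{q+1}) v$ together with eqn 5.13 reproduces the term
$2\varepsilon^T [I - P_q] d_1(q) \{\theta(q+1) + \delta\}$.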


Result 5.1 establishes the fact that $\widehat{AIC}$ (eqn 5.3), the most
practicable form of AIC, should be corrected by including additional terms
(defined by $E_q$ in eqn 5.12) to bring it in line with the assumptions made
by Akaike in the development of the AIC criterion. It should be noted here
that a number of proposed "AIC-type" criteria do include some form of
correction. For example, empirical results produced by Bhansali and Downham
[BHA1] and based on modifications of Akaike's Final Prediction Error
criterion lead to a criterion of the form:

$AIC_\alpha = N \ln(\varepsilon^T \varepsilon) + \alpha q$

For $\alpha > 2$, this modification corresponds to the addition of an extra
constant, $(\alpha - 2)$, to $\Delta\widehat{AIC}$ at each increment.
Indeed, Bhansali and Downham obtained best results with $\alpha = 4$, and
further work by Edmunds [EDM3] suggests that larger values of $\alpha$ can be
used to eliminate the possibility of overparameterizing the model. As shown
in [SCH2] and [HAN1], other proposed criteria suggest additions to
$\Delta\widehat{AIC}$ which are a function of the number of available data
samples so that, in general [DAV1], the desired criterion takes the form:

$N \ln(\varepsilon^T \varepsilon) + q f(N)$
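The effect of the penalty weight on the selected order can be illustrated
with a short sketch. The residual sums of squares below are hypothetical,
the function name is an assumption made for this example, and the choice
f(N) = ln N is used only as one illustrative instance of a
sample-size-dependent penalty.

```python
import math

def select_order(rss_by_order, n, penalty):
    """Return the order q that minimises N ln(eps^T eps) + penalty * q.

    rss_by_order maps each candidate order q to its residual sum of squares.
    """
    costs = {q: n * math.log(rss) + penalty * q
             for q, rss in rss_by_order.items()}
    return min(costs, key=costs.get)

# Hypothetical residual sums of squares for model orders 1 to 5.
rss = {1: 50.0, 2: 20.0, 3: 18.0, 4: 17.5, 5: 17.4}
n = 100

q_alpha2 = select_order(rss, n, 2.0)          # penalty of the uncorrected form
q_alpha4 = select_order(rss, n, 4.0)          # alpha = 4
q_logn = select_order(rss, n, math.log(n))    # illustrative f(N) = ln N
```

With these numbers the selected order drops from 4 to 3 as the penalty
grows, which shows how the outcome is driven by the choice of the
correction rather than by any property of the model structure.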

The "AIC-type" criteria identified above do not, however, account
explicitly for model structure as required by Result 5.1. In many
instances, this indirect approach can be justified. For autoregressive (AR)
and mixed autoregressive/moving average (ARMA) models, the error terms
$\theta(q+1)$ and $\tilde{\theta}_{q+1}$ cannot be systematically
identified. So it is not generally possible to identify the proper,
model-dependent corrections. In these situations, the criteria mentioned
above offer implementable alternatives by generating appropriate
approximations for the required corrections. But, when the goal of model
selection is weighting sequence truncation, additional characteristics of
the model class can and, in fact, must be considered to identify the "best"
model. For this problem, the use of either arbitrary incremental corrections
or corrections based solely on the number of data points produces arbitrary
truncations; as the size of the incremental correction increases, the number
of terms in the model necessarily decreases. Hence, criteria which use a
correction constant, $\alpha$, or a correction function, f(N), tend to
generate truncation levels that depend almost entirely on an a priori choice
of the correction. Instead, as shown in the following section, it is
possible to systematically generate precise incremental corrections for
$\widehat{AIC}$ (eqn 5.3) using special weighting sequence characteristics.
The result is a new criterion which effectively balances the improvements in
data fit achieved by increasing model order with the increased variability
of the resulting parameter estimates to identify the desired truncation. At
the same time, this criterion generates additional information on the errors
introduced by truncation.

5.2  Optimal  Truncation:  An  Alternative  Parameter-Space  Criterion 

Result 5.1 establishes an accurate relationship between $\Delta AIC$ and
$\Delta\widehat{AIC}$. Unfortunately, this relationship is a function of
several unknown quantities. For weighting sequences however, these unknown
variables can be accurately identified. As a result, an alternative
realizable criterion which identifies precisely the same truncated model as
the true minimum AIC criterion can be generated. The development of this
criterion begins with Result 5.2 which establishes an equivalent expression
for AIC in terms of $\widehat{AIC}$ (eqn 5.3) and the incremental
corrections defined by eqn 5.12. The discussion then focuses specifically on
weighting sequence truncation. First, Result 5.3 refines the class of
weighting sequence models within which the "best" model exists, and then
Result 5.4 highlights additional adjustments to the criterion of Result 5.2
that are unique to weighting sequence models. From the insights obtained in
Result 5.4, a new truncation criterion is proposed in Result 5.5, and this
criterion is shown to possess the same properties as the minimum AIC
criterion. Finally, Result 5.6 establishes the conditions required to
produce an implementable procedure for the identification of the "best"
truncation using this new criterion.

To begin, the minimum AIC criterion can be reformulated using Result
5.1 to produce:

Result 5.2: Consider the class of models $\Theta$ which satisfy Assumption
5.1, and let q represent the dimension of any specified model in the class.
For $E_q$ defined by eqn 5.12, define $AIC_q^{(1)}$ as follows:

$AIC_{q_0}^{(1)} = \widehat{AIC}_{q_0}$    ....(5.16a)

$AIC_q^{(1)} = \widehat{AIC}_q + \sum_{i=q_0}^{q-1} E_i$  for $q > q_0$    ....(5.16b)

Then the "best" model within $\Theta$ can be identified by minimizing
$AIC_q^{(1)}$ over all $q \geq q_0$.

Proof: Consider the model of dimension $q_0$ within $\Theta$. Using eqns 5.2
and 5.3, $AIC_{q_0}$ and $\widehat{AIC}_{q_0}$ can be related by:

$AIC_{q_0} = \widehat{AIC}_{q_0} + c_0$    ....(5.17a)

where

$c_0 = N \ln\left\{ 1 - \frac{2 \varepsilon_{q_0}^T \mu_{q_0} - \mu_{q_0}^T \mu_{q_0}}{\varepsilon_{q_0}^T \varepsilon_{q_0}} \right\}$ = constant.

Result 5.1 demonstrates that $\Delta AIC_q$ and $\Delta AIC_q^{(1)}$ are
equivalent for models within $\Theta$ and, hence, $AIC_{q_0+1}$ is
equivalent to

$AIC_{q_0+1}^{(1)} = AIC_{q_0} + \Delta AIC_{q_0}^{(1)} = \widehat{AIC}_{q_0} + c_0 + \Delta\widehat{AIC}_{q_0} + E_{q_0}$

$= \widehat{AIC}_{q_0+1} + c_0 + E_{q_0}$

This procedure can be repeated for each incremental change in dimension to
establish the value of $AIC_q^{(1)}$ for any $q > q_0$ as:

$AIC_q^{(1)} = \widehat{AIC}_q + c_0 + \sum_{i=q_0}^{q-1} E_i$    ....(5.17b)

By Definition 5.1, the "best" model within $\Theta$ is obtained by
minimizing $AIC_q$ over all models in the class. Since $\Delta AIC_q^{(1)}$
is equivalent to $\Delta AIC_q$ for all $q \geq q_0$, $AIC_q^{(1)}$ (as
defined by eqns 5.17a,b) is equivalent to $AIC_q$ for each $q \geq q_0$.
Hence, minimizing $AIC_q^{(1)}$ also identifies the "best" model in the
class. Furthermore, since the location of the minimum is unaffected by the
addition of a constant, $AIC_q^{(1)}$ can be simplified by removing the
constant $c_0$ to produce the cost function described by eqns 5.16a,b.

....QED
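The cost function of eqns 5.16a,b is a running sum of incremental
corrections added to the uncorrected criterion. A minimal sketch is given
below; the function name and the numerical values are hypothetical, and the
corrections are simply supplied as inputs because eqn 5.12 depends on
quantities that are not directly available.

```python
def corrected_criterion(aic_hat, corrections):
    """Apply eqns 5.16a,b.

    aic_hat[k] is the uncorrected criterion at order q0 + k and
    corrections[k] is the incremental correction E at order q0 + k.
    Returns the corrected values for the same orders.
    """
    corrected = []
    running_sum = 0.0                 # empty sum at q = q0 (eqn 5.16a)
    for k, value in enumerate(aic_hat):
        corrected.append(value + running_sum)
        if k < len(corrections):
            running_sum += corrections[k]   # adds E at order q0 + k
    return corrected

# Hypothetical values for orders q0, q0+1, q0+2, q0+3.
aic_hat = [310.0, 300.0, 296.0, 295.0]
E = [1.0, 3.0, 6.0]

aic1 = corrected_criterion(aic_hat, E)
best_offset = aic1.index(min(aic1))         # offset of the minimum from q0
```

In this example the uncorrected values are smallest at the highest order,
while the corrected values are smallest one order lower, which is the kind
of shift the corrections are intended to produce.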


The order selection criterion defined by Result 5.2 still depends on
several unknown quantities. For weighting sequences, important
relationships between these quantities exist and can be used to transform
the results above into a realizable truncation criterion. Consider the
correction to $\widehat{AIC}$ (eqn 5.3) defined by eqn 5.12. When weighting
sequence models are used, the terms $d_1(q)$, $P_q$, and $\tilde{D}_{q+1}$
are simply functions of the inputs used during the identification test,
while $\theta(q+1)$ and $\tilde{\theta}_{q+1}$ behave in a predictable
manner that can be identified and used to accurately estimate these
quantities (as will be shown later). However, $\varepsilon$ (the true noise
in the output data) is random and cannot be identified. Since $\varepsilon$
is uncorrelated with the true system parameters and test inputs, the average
value of $2\varepsilon^T [I - P_q] d_1(q) \{\theta(q+1) + \delta\}$ is zero
and, at first glance, it would seem reasonable to assume that this term can
be neglected when calculating the desired correction, $E_q$. But for any
given identification test, this term is not identically zero. Furthermore,
because the elements of the true weighting sequence grow smaller, this term
may become a significant component of $E_q$ for many models in $\Theta$. In
these cases, the failure to account for this random effect could produce an
arbitrary and erroneous selection of the "best" model order. To avoid this
problem, the class of weighting sequence models established by Assumption
5.1 can be restricted as follows:


Result 5.3: Let $\Theta_{WS}$ represent the class of weighting sequence
models (obtained using a least-squares procedure) within which the "best"
model exists. Then, $\Theta_{WS}$ is a subset of $\Theta$, and this subset
contains only models which satisfy the following condition:

$\theta^2(q+1)\, d_1^T(q) [I - P_q] d_1(q) + (2\theta(q+1) + \delta)\, d_1^T(q) [I - P_q] \tilde{D}_{q+1} \tilde{\theta}_{q+1} \gg 2 \varepsilon^T [I - P_q] d_1(q) \{\theta(q+1) + \delta\}$    ....(5.18)

The dimension of the model of largest dimension within $\Theta_{WS}$ will be
denoted by $q_f$.

Proof: Since $\varepsilon$ is uncorrelated with the true system parameters
and test inputs, $2\varepsilon^T [I - P_q] d_1(q) \{\theta(q+1) + \delta\}$
represents only random non-zero effects introduced by the use of finite data
sets. Attempts to estimate parameters for which this quantity is significant
will, therefore, be dominated by unknown random errors and the resulting
parameter estimates will necessarily include large random components.
Clearly, the "best" system model cannot contain parameters for which
accurate estimates are unavailable, yet these are the only models in
$\Theta$ which are excluded from $\Theta_{WS}$ by condition 5.18. Hence, the
"best" model must be an element of $\Theta_{WS}$.

....QED

It is important to emphasize that Result 5.3 does not imply a direct
relationship between the absolute size of the parameters that can be
estimated and the size of the noise present in the data (as measured by its
variance). Indeed, user-specified test conditions (e.g. sample size and
system inputs) can be selected so that weighting sequence elements which are
significantly smaller than the variance of the noise can still be accurately
estimated. Thus, Result 5.3 simply refines the description of the model
class for a given set of input/output data (as mentioned previously) so that
an appropriate truncation criterion will not be affected by random, and
possibly harmful, effects.

For models which satisfy Result 5.3,
$2\varepsilon^T [I - P_q] d_1(q) \{\theta(q+1) + \delta\}$ can be neglected
and the identification of $\varepsilon$ is unnecessary. As a result, the
corrections to $\widehat{AIC}$ (eqn 5.3) identified by eqn 5.12 for any
given model are now simply functions of the known inputs and the unknown
terms $\theta(q+1)$ and $\tilde{\theta}_{q+1}$. But for weighting sequences,
$\theta(q+1)$ and $\tilde{\theta}_{q+1}$ can be related to the parameters of
the q-dimensional model in the following way:

Definition 5.2: Let $\theta(q)$ denote the q-th element of the infinite
weighting sequence for a stable system. Then, there exists an element
$\theta(q_e)$ in the sequence such that, for all $q \geq q_e$,
$\theta(q+k)$ can be bounded in size by a single exponential decay

$|\theta(q+k)| \leq \rho_q^k \, |\theta^*(q)| \quad \forall k > 0$    ....(5.19)

where $\theta^*(q)$ lies on the boundary of the exponential decay and
$\rho_q$ is a constant which describes the rate of decay (and is defined by
$\theta^*(q+1)/\theta^*(q)$). Eqn 5.19 defines the exponential bounding
property for infinite weighting sequence descriptions of stable systems.
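The bounding property of eqn 5.19 can be checked directly on a finite
record of weighting sequence elements. The sketch below is illustrative
only: the function name, the small numerical tolerance, and the two test
sequences (a first-order decay and an oscillatory decay with the same
envelope) are assumptions made for this example.

```python
import math

def is_exponentially_bounded(theta, q, rho, theta_star, tol=1e-12):
    """Check eqn 5.19 for every available k > 0.

    theta[i - 1] holds element i of the weighting sequence (the text uses
    1-based indices). Returns True when |theta(q + k)| <= rho**k * |theta_star|
    for all k with q + k inside the record.
    """
    for k in range(1, len(theta) - q + 1):
        bound = rho ** k * abs(theta_star)
        if abs(theta[q + k - 1]) > bound + tol:
            return False
    return True

# Hypothetical weighting sequences with envelope 0.8**i, i = 1..20.
monotone = [0.8 ** i for i in range(1, 21)]
oscillatory = [0.8 ** i * math.cos(i) for i in range(1, 21)]

envelope_at_5 = 0.8 ** 5
ok_monotone = is_exponentially_bounded(monotone, 5, 0.8, envelope_at_5)
ok_oscillatory = is_exponentially_bounded(oscillatory, 5, 0.8, envelope_at_5)
too_fast = is_exponentially_bounded(monotone, 5, 0.7, envelope_at_5)
```

For the monotone sequence the bound is met with equality, while for the
oscillatory sequence it holds only as an inequality; a decay rate that is
too fast (0.7 here) violates the bound at the first neglected element.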

The  use  of  open-loop  identification  techniques  implies  that  the  system  under 
investigation  is  stable,  and  so  the  elements  of  its  weighting  sequence  will 
be  exponentially  bounded.  In  addition,  any  accurate  finite  weighting 
sequence  model  must  necessarily  exclude  only  elements  which  can  be  bounded 
by this decay. Hence, all models in $\Theta_{WS}$ will contain elements that can be
exponentially  bounded.  As  a  result,  the  exponential  decay  property  can  be 
used  to  modify  the  order  selection  criterion  established  by  Result  5.2  as 
shown  here: 


Result 5.4: Consider a system whose weighting sequence elements,
$\theta(i)$, are sign definite for $q_0 \leq i \leq q_f$. Then for any
specified set of input/output data (where the inputs are generated as an
uncorrelated zero-mean sequence and $N \gg q_f$), the "best" model order can
be identified by minimizing $AIC_q^{(2)}$ over all q in the range
$q_0 \leq q \leq q_f$, where $AIC_q^{(2)}$ is defined by:

$AIC_{q_0}^{(2)} = \widehat{AIC}_{q_0}$    ....(5.20a)

$AIC_q^{(2)} = \widehat{AIC}_q + \sum_{i=q_0}^{q-1} \left\{ \frac{N}{\varepsilon_i^T \varepsilon_i} \left[ \rho_i^2 \, \theta^{*2}(i) \, d_1^T(i) d_1(i) \right] \right\}, \quad q > q_0$    ....(5.20b)

and $\theta^*(q)$ and $\rho_q$ are defined by Definition 5.2.

Proof: Since all models in $\Theta_{WS}$ satisfy Assumption 5.1, Result 5.2
is valid and the "best" model order can be identified by selecting the model
which minimizes $AIC_q^{(1)}$ (eqn 5.16) over all q in the range
$q_0 \leq q \leq q_f$. Now Result 5.3 implies that
$2\varepsilon^T [I - P_q] d_1(q) \{\theta(q+1) + \delta\}$ is negligible for
all model orders $q_0 \leq q \leq q_f$. Furthermore, when $N \gg q_f$, the
use of uncorrelated inputs implies that the quantities
$d_1^T(q) \tilde{D}_{q+1}$, $d_1^T(q) P_q \tilde{D}_{q+1}$, and
$d_1^T(q) P_q d_1(q)$ are also negligible. Hence under the assumed
conditions, $E_q$ (eqn 5.12) can be accurately replaced by
$\frac{N}{\varepsilon_q^T \varepsilon_q} \theta^2(q+1) \, d_1^T(q) d_1(q)$,
and $AIC_q^{(1)}$ is equivalent to

$\widehat{AIC}_q + \sum_{i=q_0}^{q-1} \left\{ \frac{N}{\varepsilon_i^T \varepsilon_i} \left[ \theta^2(i+1) \, d_1^T(i) d_1(i) \right] \right\}; \quad q > q_0$

over the desired range of q. Since $\theta(q)$ is sign definite over the
same range, Definition 5.2 implies that $\theta(q+1)$ may be replaced by
$\rho_q \theta^*(q)$ and $AIC_q^{(2)}$ can be rewritten as:

$AIC_q^{(2)} = \widehat{AIC}_q + \sum_{i=q_0}^{q-1} \left\{ \frac{N}{\varepsilon_i^T \varepsilon_i} \left[ \rho_i^2 \, \theta^{*2}(i) \, d_1^T(i) d_1(i) \right] \right\}$

Since $AIC_q^{(2)}$ and $AIC_q^{(1)}$ are equivalent for all
$q_0 \leq q \leq q_f$, minimizing $AIC_q^{(2)}$ produces the same result as
minimizing $AIC_q^{(1)}$. Thus, by Result 5.2, minimizing $AIC_q^{(2)}$ must
identify the "best" model within $\Theta_{WS}$.

....QED


Result 5.4 demonstrates that minimizing $AIC_q^{(2)}$ over all q in the
range $q_0 \leq q \leq q_f$ identifies the truncation level which is "best"
in the statistical context of the AIC criterion initially proposed by Akaike
(Definition 5.1). It should be noted that the criterion is still not
implementable since, for example, a priori information on $q_0$ and $q_f$ is
required. However, Result 5.4 does simplify the criterion considerably and
provides important insights for its implementation. These insights can be
used to establish an alternative criterion which, though similar to
$AIC_q^{(2)}$, is implementable.

Before proceeding to this development, two additional points concerning
Result 5.4 should also be highlighted. First, the assumption of
uncorrelated zero-mean inputs was introduced primarily to simplify the
following analysis. In fact, the use of this test input guarantees
equivalence between $AIC_q^{(2)}$ and $AIC_q^{(1)}$ (and hence $AIC_q$)
since the quantity
$\{2\theta(q+1) + \delta\}\, d_1^T(q) [I - P_q] \tilde{D}_{q+1} \tilde{\theta}_{q+1}$
is negligible in this situation and $\theta(q+1)$ is equal to
$\rho_q \theta^*(q)$. When other inputs are used, Result 5.4 may be extended
simply by including this additional term and using $\rho_q^k \theta^*(q)$ as
an upper bound on the k-th truncation term in the vector
$\tilde{\theta}_{q+1}$. In this situation, $AIC_q^{(2)}$ establishes an
accurate upper bound for $AIC_q^{(1)}$ and, in the absence of any additional
information, it generates the best available estimate of AIC. Second, the
assumption that the system weighting sequence elements are sign definite for
$q_0 \leq q \leq q_f$ is valid for all systems except those with dominant
complex poles. So the equivalence between $AIC_q^{(2)}$ and $AIC_q^{(1)}$
will hold. When the weighting sequence elements are oscillatory however,
$\theta(q+1)$ is not equal to $\rho_q \theta^*(q)$, but instead
$|\theta(q+1)| \leq \rho_q |\theta^*(q)|$. As a result, the correction terms
used to transform $\widehat{AIC}_q$ to $AIC_q^{(2)}$ are upper bounds on the
true corrections required to maintain the equivalence between $AIC_q^{(2)}$
and $AIC_q^{(1)}$. It must be kept in mind, however, that the value of
$AIC_q^{(2)}$ for any given q will be compared with all other values of
$AIC^{(2)}$ to establish the "best" model order. Hence, the correction terms
must not only provide an accurate description of parameter size, but they
must also produce a description which is relevant for comparative purposes.
When this perspective is taken, it is easy to see that the use of
$\rho_q^2 \theta^{*2}(q)$ in place of $\theta^2(q+1)$ for oscillatory
weighting sequences accomplishes both objectives. Thus, using $AIC_q^{(2)}$
to identify the "best" truncation should also work effectively for
oscillatory systems.

As mentioned previously, a closer examination of eqns 5.20a,b [...]
valuable insights into the implementation of the [...]. [...] $d_1(q)$ is
known for any q, while $\theta^*(q)$ is a function [...] weighting sequence
elements. For systems with a dom[...] best estimate of $\theta^*(q)$ is
simply the last [...] of dimension q; while for systems with dominant [...]
estimated on the basis of the dominant [...] several parameter estimates.
Since [...] within $\Theta_{WS}$, the model parameter estimates are
necessarily reliable representations of the true parameters, and so they
can indeed be used to identify $\theta^*(q)$ accurately. The only remaining
unknown in $AIC_q^{(2)}$ is the decay constant $\rho_q$. At first glance, it
may seem strange to introduce this parameter into the criterion. Since
$\theta^2(q+1) = \rho_q^2 \theta^{*2}(q)$, Result 5.4 implies that the
appropriate correction term for $\widehat{AIC}_q$ only requires knowledge of
$\theta^2(i)$ for $i = q_0, \ldots, q_f$ and, for models within
$\Theta_{WS}$, accurate estimates of these quantities are available. Thus,
it would appear that these estimates may be used to establish
$AIC_q^{(2)}$. However as noted above, $q_0$ and $q_f$ are unknown. So, the
model class $\Theta_{WS}$ is not known a priori. If models outside
$\Theta_{WS}$ are included in the investigation, the correction terms
associated with these models will be incorrect. As a result, $AIC_q^{(2)}$
will not be an accurate representation of $AIC_q$, and the truncation which
minimizes $AIC_q^{(2)}$ will not necessarily correspond to the "best"
truncation. In many cases therefore, the indiscriminate use of parameter
estimates alone to calculate $AIC_q^{(2)}$ will lead to erroneous
truncations.

The introduction of the decay constant $\rho_q$, however, offers an
alternative view of the required corrections. In particular, each $\rho_i$
may be interpreted as a "scale factor" for the specified correction. When
this perspective is taken, a new truncation criterion which avoids the
problem described above can be generated. For any specified model in
$\Theta_{WS}$, the exponential decay associated with this model can be
directly related to the "parameter" bias of the model defined by
$\sqrt{\tilde{\theta}_q^T \tilde{\theta}_q}$ (i.e. the size of the error
introduced by truncation). More specifically,

$\tilde{\theta}_q^T \tilde{\theta}_q \leq \rho_q^2 \, \theta^{*2}(q) / (1 - \rho_q^2) = B_q^2$    ....(5.21)

Hence, $\rho_q$ quantifies an upper bound on "parameter" bias for the
specified model. Eqn 5.21 defines an interesting relationship between
$\rho_q$, $\theta^*(q)$, and $B_q$. Clearly, if $\theta^*(q)$ and $\rho_q$
are known, $B_q$ can be calculated. However, the roles of $B_q$ and $\rho_q$
may also be interpreted in reverse. If $\theta^*(q)$ is known and $B_q$ is
specified, $\rho_q$ can be calculated to ensure that the equality in eqn
5.21 is satisfied. This role reversal leads to the following alternative
truncation criterion:


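Result 5.5: Let $q_{best}$ denote the (as yet unknown) order of the model
within $\Theta_{WS}$ which minimizes $AIC_q^{(2)}$, and let
$B^2_{(q_{best}-1)}$ represent the upper bound on the "parameter" bias (eqn
5.21) associated with the model of order $q_{best} - 1$; namely, let

$B^2_{(q_{best}-1)} = \rho^2_{(q_{best}-1)} \, \theta^{*2}(q_{best} - 1) / (1 - \rho^2_{(q_{best}-1)})$

Furthermore, for any given q, define the scale factor $\bar{\rho}_q$ such
that

$\bar{\rho}_q^2 \, \theta^{*2}(q) / (1 - \bar{\rho}_q^2) = B^2_{(q_{best}-1)}$

Then, the quantity $T_q$ defined by:

$T_{q_0} = \widehat{AIC}_{q_0}$    ....(5.22a)

$T_q = \widehat{AIC}_q + \sum_{i=q_0}^{q-1} \left\{ \frac{N}{\varepsilon_i^T \varepsilon_i} \left[ \bar{\rho}_i^2 \, \theta^{*2}(i) \, d_1^T(i) d_1(i) \right] \right\}, \quad q > q_0$    ....(5.22b)

will possess a unique minimum for all q in the range $q_0 \leq q \leq q_f$,
and this minimum will occur at $q_{best}$.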
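The role reversal between the decay constant and the bias bound amounts to
inverting eqn 5.21. The sketch below shows both directions; the function
names and the numerical values are hypothetical, and the closed-form
inverse follows from rearranging eqn 5.21 as
$\rho^2 (\theta^{*2} + B^2) = B^2$.

```python
import math

def bias_bound_sq(rho, theta_star):
    """Right-hand side of eqn 5.21: rho^2 theta*^2 / (1 - rho^2)."""
    return rho ** 2 * theta_star ** 2 / (1.0 - rho ** 2)

def scale_factor(bias_sq, theta_star):
    """Solve eqn 5.21 for the decay constant given a specified bias bound.

    Returns the value in [0, 1) with rho^2 theta*^2 / (1 - rho^2) = bias_sq.
    """
    return math.sqrt(bias_sq / (theta_star ** 2 + bias_sq))

# Hypothetical values: decay constant 0.5 and boundary element 2.0.
b_sq = bias_bound_sq(0.5, 2.0)

# Inverting the relation recovers the decay constant.
rho_back = scale_factor(b_sq, 2.0)

# The same bias bound with a larger boundary element gives a smaller factor.
rho_smaller = scale_factor(b_sq, 4.0)
```

For a fixed bias bound, the scale factor decreases as the boundary element
grows, so lower-order models (whose last elements are larger) receive
smaller scale factors than their own decay constants would give.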


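Proof: First, consider models of order q within $\Theta_{WS}$ where
$q > q_{best}$. For these models, $AIC_q^{(2)}$ can be related to
$AIC_{q_{best}}^{(2)}$ by:

$AIC_q^{(2)} - AIC_{q_{best}}^{(2)} = \widehat{AIC}_q - \widehat{AIC}_{q_{best}} + \sum_{i=q_{best}}^{q-1} \left\{ \frac{N}{\varepsilon_i^T \varepsilon_i} \left[ \rho_i^2 \, \theta^{*2}(i) \, d_1^T(i) d_1(i) \right] \right\}$

By definition, $AIC_{q_{best}}^{(2)}$ is a minimum. So,
$AIC_q^{(2)} - AIC_{q_{best}}^{(2)}$ is greater than zero. As a result,

$\widehat{AIC}_q - \widehat{AIC}_{q_{best}} > -\sum_{i=q_{best}}^{q-1} \left\{ \frac{N}{\varepsilon_i^T \varepsilon_i} \left[ \rho_i^2 \, \theta^{*2}(i) \, d_1^T(i) d_1(i) \right] \right\}$    ....(5.23)

Using eqn 5.22, $T_q$ and $T_{q_{best}}$ can be related by:

$T_q - T_{q_{best}} = \widehat{AIC}_q - \widehat{AIC}_{q_{best}} + \sum_{i=q_{best}}^{q-1} \left\{ \frac{N}{\varepsilon_i^T \varepsilon_i} \left[ \bar{\rho}_i^2 \, \theta^{*2}(i) \, d_1^T(i) d_1(i) \right] \right\}$

But substituting the inequality 5.23 into this expression yields:

$T_q - T_{q_{best}} > \sum_{i=q_{best}}^{q-1} \left\{ \frac{N}{\varepsilon_i^T \varepsilon_i} \left[ (\bar{\rho}_i^2 - \rho_i^2) \, \theta^{*2}(i) \, d_1^T(i) d_1(i) \right] \right\}$

Since $B^2_{(q_{best}-1)} \geq B_q^2$ for all $q \geq q_{best}$,
$\bar{\rho}_q \geq \rho_q$. Hence, $T_q > T_{q_{best}}$ for all
$q > q_{best}$.

A similar approach can be used to develop the following relationship
between $T_q$ and $T_{q_{best}}$ for any q in the range
$q_0 \leq q < q_{best}$:

$T_q - T_{q_{best}} > \sum_{i=q}^{q_{best}-1} \left\{ \frac{N}{\varepsilon_i^T \varepsilon_i} \left[ (\rho_i^2 - \bar{\rho}_i^2) \, \theta^{*2}(i) \, d_1^T(i) d_1(i) \right] \right\}$

Since $B^2_{(q_{best}-1)} \leq B_q^2$ for all $q < q_{best}$,
$\rho_q \geq \bar{\rho}_q$. Hence, $T_q > T_{q_{best}}$ for all q in the
range $q_0 \leq q < q_{best}$. Combining this result with the result for
$q > q_{best}$ establishes $T_{q_{best}}$ as the minimum value of $T_q$ for
all models in $\Theta_{WS}$.

....QED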


Result 5.5 produces an alternative criterion which is equivalent to the
minimum $AIC^{(2)}$ criterion of Result 5.4 and, hence, this new criterion
must also identify the truncation level which is "best" in the statistical
sense associated with Akaike's criterion. More importantly, this
alternative criterion can be implemented using data from the parameter
estimation process even though $q_0$ and $q_f$ are unknown. So, it will
avoid the problems which can produce an erroneous identification of the
"best" truncation using the minimum $AIC^{(2)}$ criterion.

An appropriate procedure for truncation level identification based on
the cost function $T_q$ (eqn 5.22) depends on several subtle implications of
Result 5.5 which must be highlighted. First, the proof that $T_q$ achieves a
minimum at $q_{best}$ relies only on differences between $T_q$ and $T_{q_{best}}$ for q in
the range $q_0 < q < q_f$, which suggests that the relationship between $T_q$ and
$T_{q_{best}}$ (over this range) is independent of the truncation level at which $T_q$
is initialized, provided this initial truncation level is less than or equal
to $q_0$. As suggested previously, any accurate weighting sequence model must
necessarily include elements which can be exponentially bounded. Hence, by
Assumption 5.1, $q_0$ must be greater than $q_b$ (where $q_b$ is defined by
Definition 5.2). So, $q_b$ provides an appropriate and, indeed, easily
identifiable truncation level at which to initialize $T_q$. In essence, this
observation implies that $T_q$ may be recomputed using eqns 5.22a,b with $q_0$
replaced by $q_b$. For these redefined values of $T_q$, Result 5.5 remains valid
over the interval $q_0 < q < q_f$ and $T_q$ will reach a unique minimum at $q_{best}$.
Thus, an implementable procedure based on Result 5.5 need not rely on
knowledge of $q_0$ to initialize the criterion.

A second implication of Result 5.5 is the following. If $q_{AIC}$ is the
order of the weighting sequence model which minimizes $AIC_q$, then $AIC_q$ is, by
definition, greater than $AIC_{q_{AIC}}$ for all $q > q_{AIC}$. Furthermore, since $T_q$ in
eqn 5.22 is formed by adding a non-decreasing function to $AIC_q$, it is now
obvious that $T_q$ will be greater than $T_{q_{AIC}}$ for all $q > q_{AIC}$. Hence, $T_q$ must
be a minimum for some $q \le q_{AIC}$, and so $q_{AIC}$ establishes an easy-to-calculate
upper bound for $q_{best}$.
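The bounding argument above can be illustrated with a minimal sketch. It assumes the common form AIC_q = N ln(RSS_q / N) + 2q, which may differ from eqn 5.3, and it uses a made-up residual curve; the function name `candidate_orders`, the value of `q_b` and the residual model are all hypothetical.

```python
import numpy as np

def candidate_orders(rss, n_samples, q_b):
    """Return (aic, q_aic, candidates) for residual sums of squares rss[q-1].

    The AIC form used here is an assumption, not necessarily eqn 5.3.
    """
    q = np.arange(1, len(rss) + 1)
    aic = n_samples * np.log(rss / n_samples) + 2 * q
    q_aic = int(q[np.argmin(aic)])
    # Candidates for q_best lie above q_b and do not exceed q_AIC.
    return aic, q_aic, list(range(q_b + 1, q_aic + 1))

# Hypothetical residual curve: noise floor plus a decaying truncation term.
N = 1000
orders = np.arange(1, 41)
rss = N * (1.0 + 5.0 * 0.8 ** (2 * orders))

aic, q_aic, candidates = candidate_orders(rss, N, q_b=3)
print(q_aic, candidates[0], candidates[-1])
```

Only the orders between q_b + 1 and q_AIC need to be examined by the later bias-bound verification step.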

Together, the observations above demonstrate that $q_{best}$ must
necessarily lie in the range $q_b < q \le q_{AIC}$. [Indeed, additional refinements
(using AIC) to reduce the number of candidates for $q_{best}$ still further can
be established as shown in Appendix 5.1. Though not essential to the
development of an implementable algorithm for truncation selection, these
additional refinements can be used to reduce the computational complexity of
the resulting algorithm.] Once the set of potential candidates for $q_{best}$
has been identified, the only remaining problem is to select the "best"
model from among the members of this set. A procedure to accomplish this
task is suggested by the following result:



Result 5.6: Let Q define the known set of weighting sequence model orders
which contains $q_{best}$ and let $q_k$ denote the k-th element in Q. In addition,
define the quantity $\tilde{T}_q$ to be:

    $\tilde{T}_q = AIC_q + \sum_{i=q_b}^{q-1} \left\{ \frac{N}{\varepsilon_i^t \varepsilon_i} \left[ \tilde{\rho}_i^2 \, \hat{\theta}^{*2}(i) \, d_i^t(i) \, d_i(i) \right] \right\}$    ....(5.24)

where $\tilde{\rho}_i$ is a scale factor computed using the relationship

    $\tilde{\rho}_i^2 \, \hat{\theta}^{*2}(i) / (1 - \tilde{\rho}_i^2) = \tilde{B}^2$ ,

$\hat{\theta}^*(i)$ represents the value of $\theta^*(i)$ obtained from the estimated
model of dimension i, and $\tilde{B}^2$ is non-negative.

Then for each $q_k$, examine the set of values $\tilde{T}_q$ (calculated using eqn 5.24)
over the range $q_b \le q \le q_k$ and let $\tilde{B}^2_{max}(q_k)$ denote the maximum value of $\tilde{B}^2$
for which $\tilde{T}_q$ remains a minimum at $q_k$. The "best" model order can now be
identified by selecting the largest value of $q_k$ for which $\tilde{B}^2_{max}(q_k)$ is a
valid upper bound for $B^2_{q_k-1}$.


Proof: For weighting sequence models, increasing the model order from q
to q+1 automatically improves the model fit to the observed output data.
If the last estimated parameter in the model of order q+1 is an accurate
estimate of the corresponding true parameter, this improvement in fit has
clearly been attained by improving the weighting sequence description of
the system, not by simply fitting to the noise in the data. Hence, the
"best" truncation must necessarily correspond to the largest model order
for which the last estimated parameter is an accurate estimate of the
corresponding true parameter. Thus by Result 5.3, $q_{best} = q_f$.

Now if $q_k = q_{best}$, then $\hat{\theta}^*(q_k)$ is, by definition, an accurate
estimate of $\theta^*(q_k)$. In addition, the set of values $\tilde{T}_q$ (for $q_b \le q \le q_k$)
will be identical to the set of values $T_q$ defined by eqn 5.22 when
$\tilde{B}^2 = B^2_{q_k-1}$. Hence, $\tilde{T}_q$ (calculated using $\tilde{B}^2 = B^2_{q_k-1}$) must be a minimum at
$q_k$. But by assumption, $\tilde{T}_q$ is a minimum at $q_k$ when $\tilde{B}^2 = \tilde{B}^2_{max}(q_k)$, and this
implies that $\tilde{B}^2_{max}(q_k)$ is greater than or equal to $B^2_{q_k-1}$. Thus, if
$\tilde{B}^2_{max}(q_k)$ is not a valid upper bound for $B^2_{q_k-1}$, $q_k \ne q_{best}$. Furthermore
from above, $q_{best} = q_f$. So, this result also implies that $q_{best}$ must
necessarily correspond to the largest model order in the set Q for which
$\tilde{B}^2_{max}(q_k)$ is a valid upper bound for $B^2_{q_k-1}$.

.... QED


As  demonstrated  by  Result  5.6,  the  problem  of  optimal  weighting  sequence 
truncation  using  the  criterion  of  Result  5.5  reduces  to  a  problem  of 
verifying  the  accuracy  of  the  maximum  bias  bound  established  for  each  model 
order in Q. A complete procedure for identifying $q_{best}$ via bias bound
verification  is  presented  and  discussed  in  Appendix  5.2.  The  results  of 
several  Monte  Carlo  simulations  are  presented  in  Section  5.4  to  demonstrate 
the  validity  of  this  procedure. 

The  truncation  criterion  established  by  Results  5.5  and  5.6  possesses 
two  particularly  noteworthy  features  which  must  be  stressed.  First,  it 
guarantees  that  the  specified  truncation  point  is  established  in  a  manner 
consistent  with  the  level  of  noise  present  in  the  data  rather  than 
an  arbitrary  specification  of  the  size  of  the  individual  parameters. 
Second,  it  simultaneously  generates  an  accurate  estimate  of  the  "parameter" 
bias  introduced.  Thus,  this  new  criterion  not  only  establishes  a 
statistically  optimal  truncation  for  the  system,  but  it  also  quantifies  the 
effects of this truncation. Although the truncation effects are currently
quantified in terms of "parameter" bias, it is possible to extend the
criterion to generate frequency response bias information as demonstrated in
the following section.


5.3  Optimal  Truncation:  Frequency  Domain  Extensions 

5.3.1  A  Modified  Criterion  to  Identify  Frequency  Response  Bias 

For  infinite  weighting  sequence  models,  the  effects  of  truncation  on 
system  frequency  response  can  be  related  to  the  parameters  of  the  given 
model  using  relationships  defined  in  the  previous  sections.  In  particular, 
the error introduced by truncation at any specified z is given by:

    $g_E(z) = \sum_{i=q+1}^{\infty} \theta(i) \, z^{-i}$    ....(5.25)

So, the magnitude of $g_E(z)$ can be bounded for all |z| = 1 in the following
way:

Result 5.7: Consider a stable system described by the z-domain transfer
function

    $g(z) = \sum_{i=1}^{\infty} \theta(i) \, z^{-i}$

where $\theta(i)$ now represents the i-th element of the infinite weighting
sequence. Let q represent the dimension of the truncated model such that
$q > q_b$ (where $q_b$ is defined by Definition 5.2). Then, the magnitude of
the frequency response bias introduced by this truncation is bounded over
all frequencies from 0 to $\pi/T$ by:

    $|g_E(z)|^2 \le \rho_q^2 \, \theta^{*2}(q) / (1 - \rho_q)^2 = B^2_{FR}(q)$    ....(5.26)

Proof:

    $|g_E(z)| = \left| \sum_{i=q+1}^{\infty} \theta(i) \, z^{-i} \right| \le \sum_{i=q+1}^{\infty} |\theta(i)| \, |z^{-i}|$

But for $z = \exp(j\omega T)$, $|z^{-i}| = 1$ and so

    $|g_E(z)| \le \sum_{i=q+1}^{\infty} |\theta(i)|$

Since the weighting sequence elements are exponentially bounded,

    $|g_E(z)| \le \rho_q \, |\theta^*(q)| \, [1 + \rho_q + \rho_q^2 + \cdots] = \rho_q \, |\theta^*(q)| / (1 - \rho_q)$

and eqn 5.26 follows immediately.

.... QED
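The bound of Result 5.7 can be checked numerically. The sketch below is illustrative only: the two-mode weighting sequence, the truncation order and the bounding decay constant `rho` are hypothetical choices, and the infinite tail is truncated at 400 terms.

```python
import numpy as np

rho = 0.8                      # assumed bounding decay constant
q = 10                         # truncation order
i = np.arange(1, 401)
theta = 0.6 ** i + 0.5 * 0.75 ** i   # hypothetical weighting sequence theta(1..400)

# Confirm the exponential bound |theta(q+k)| <= rho**k * |theta(q)| on the tail.
tail = theta[q:]                       # theta(q+1), theta(q+2), ...
k = np.arange(1, len(tail) + 1)
tail_is_bounded = bool(np.all(np.abs(tail) <= rho ** k * abs(theta[q - 1])))

# Evaluate the truncation error g_E(z) on the upper half of the unit circle.
wT = np.linspace(0.0, np.pi, 721)
z = np.exp(1j * wT)
powers = -np.arange(q + 1, 401)
g_E = (tail[None, :] * z[:, None] ** powers[None, :]).sum(axis=1)

# Square root of the right-hand side of eqn 5.26.
B_FR = rho * abs(theta[q - 1]) / (1.0 - rho)
print(np.abs(g_E).max(), B_FR)
```

For this sequence the largest error magnitude occurs at zero frequency, where all tail terms add in phase.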


Result 5.7 establishes a single upper bound on the frequency response
bias introduced by truncation in terms of the bounding exponential decay
constant, $\rho$, and the parameters of the specified finite weighting sequence
model. But this is precisely the same information required to define the
"parameter" bias bound used to derive Result 5.5. Indeed, a comparison of
eqns 5.21 and 5.26 indicates that $B^2_q$ and $B^2_{FR}(q)$ are related by:

    $B^2_{FR}(q) = B^2_q \, (1 + \rho_q) / (1 - \rho_q)$ ,    ....(5.27)

and this relationship suggests that the optimal truncation criterion can be
redefined using $B^2_{FR}(q)$ as shown here:


Result 5.8: Let $q_{best}$ denote the (as yet unknown) order of the model
within the class of weighting sequence models which minimizes $AIC^{(2)}_q$, and
let $B^2_{FR}(q_{best}-1)$ denote the upper bound on frequency response bias
(defined by eqn 5.26) associated with the model of order $q_{best}-1$. In
addition, define the scale factor $\rho_{FR}(q)$ (for any given q) such that

    $\rho^2_{FR}(q) \, \theta^{*2}(q) / [1 - \rho_{FR}(q)]^2 = B^2_{FR}(q_{best}-1)$

Then, the quantity $T_{FR}(q)$ defined (for $q > q_b$) by:

    $T_{FR}(q) = AIC_q + \sum_{i=q_b}^{q-1} \left\{ \frac{N}{\varepsilon_i^t \varepsilon_i} \left[ \rho^2_{FR}(i) \, \theta^{*2}(i) \, d_i^t(i) \, d_i(i) \right] \right\}$    ....(5.28)

will possess a unique minimum for all q in the range $q_b < q < q_f$, and this
minimum will occur at $q_{best}$.

Proof: Since $B^2_{FR}(q)$ and $B^2_q$ are related by eqn 5.27, the relationships
used to prove Result 5.5 are also valid for $T_{FR}(q)$. Thus, the proof here
is identical to that of Result 5.5.

.... QED

The similarities between Results 5.5 and 5.8 can also be used to extend
Result 5.6 to the frequency domain as shown here:

Result 5.9: Let Q define the known set of weighting sequence model orders
which contains $q_{best}$ and let $q_k$ denote the k-th element in Q. In addition,
define the quantity $\tilde{T}_{FR}(q)$ to be:

    $\tilde{T}_{FR}(q) = AIC_q + \sum_{i=q_b}^{q-1} \left\{ \frac{N}{\varepsilon_i^t \varepsilon_i} \left[ \tilde{\rho}^2_{FR}(i) \, \hat{\theta}^{*2}(i) \, d_i^t(i) \, d_i(i) \right] \right\}$    ....(5.29)

where $\tilde{\rho}_{FR}(i)$ is a scale factor computed using the relationship

    $\tilde{\rho}^2_{FR}(i) \, \hat{\theta}^{*2}(i) / [1 - \tilde{\rho}_{FR}(i)]^2 = \tilde{B}^2_{FR}$ ,

$\hat{\theta}^*(i)$ represents the value of $\theta^*(i)$ obtained from the estimated
model of dimension i, and $\tilde{B}^2_{FR}$ is non-negative.

Then for each $q_k$, examine the set of values $\tilde{T}_{FR}(q)$ (calculated using eqn
5.29) over the range $q_b \le q \le q_k$ and let $\tilde{B}^2_{FR,max}(q_k)$ denote the maximum
value of $\tilde{B}^2_{FR}$ for which $\tilde{T}_{FR}(q)$ remains a minimum at $q_k$. The "best" model
order can now be identified by selecting the largest value of $q_k$ for which
$\tilde{B}^2_{FR,max}(q_k)$ is a valid upper bound for $B^2_{FR}(q_k-1)$.

Proof: With appropriate substitutions, the proof is identical to that of
Result 5.6.

.... QED
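The proportionality between the frequency response bias bound and the "parameter" bias bound (eqn 5.27) underlies both proofs above. It can be written out in two lines, under the assumption that eqn 5.21 has the form $B^2_q = \rho_q^2 \theta^{*2}(q) / (1 - \rho_q^2)$; that form is inferred here from the scale-factor relationship in Result 5.6 and is not quoted from eqn 5.21 itself.

```latex
\begin{align*}
B^2_{FR}(q)
  &= \frac{\rho_q^2\,\theta^{*2}(q)}{(1-\rho_q)^2}
   = \frac{\rho_q^2\,\theta^{*2}(q)}{1-\rho_q^2}\cdot\frac{1-\rho_q^2}{(1-\rho_q)^2} \\
  &= B^2_q\,\frac{(1-\rho_q)(1+\rho_q)}{(1-\rho_q)^2}
   = B^2_q\,\frac{1+\rho_q}{1-\rho_q}
\end{align*}
```

Since $0 < \rho_q < 1$, the factor $(1+\rho_q)/(1-\rho_q)$ exceeds one, so the frequency response bound is always the larger of the two quantities.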


It should be noted that, since $B^2_{FR}(q)$ is proportional to $B^2_q$, the truncation
established by the modified criterion of Results 5.8 and 5.9 must
necessarily be identical to the truncation established using Results 5.5 and
5.6. The only difference between these two criteria is the interpretation
given to the maximum bias bounds generated by the procedures. When the
frequency response criterion is used, $\tilde{B}^2_{FR,max}(q_{best})$ is a valid upper bound
for $B^2_{FR}(q_{best}-1)$ and, by Result 5.7, it is also a valid upper bound on the
magnitude of the frequency response bias introduced by truncation for all
frequencies from 0 to $\pi/T$. Monte Carlo simulations were used to demonstrate
this frequency-response-based criterion, and the results presented in
Section 5.4 for these simulations clearly demonstrate the validity of this
modified procedure.

5.3.2  Additional  Refinements  for  Selected  Individual  Frequencies 

For many systems, the use of a single quantity (such as $B^2_{FR}(q)$ in
Result 5.7) to bound frequency response bias produces an unnecessarily
conservative description of bias over a wide range of frequencies. In these
situations, further frequency response modifications can be introduced to
refine the bias description by establishing frequency-dependent bias bounds
which produce a more representative description at specified individual
frequencies.

Consider a first-order system. For this system, the error term
introduced by truncation is given exactly by

    $g_E(z) = g_{q+1} \, z^{-(q+1)} \, [1 + \rho z^{-1} + \rho^2 z^{-2} + \cdots] = g_{q+1} \, z^{-q} / (z - \rho)$

and the frequency response bias introduced by truncation can be evaluated at
any individual frequency, $\omega T$, simply by substituting $z = \exp(j\omega T)$ into this
expression. Thus, a distinct bias bound can be established at each
individual frequency using the relationship:

    $|\rho \, g_q \, z^{-q}|^2 / |z - \rho|^2 = \rho^2 g_q^2 / |z - \rho|^2 = B^2_{FR}(z;q)$    ....(5.30)

In fact, eqn 5.30 provides a valid and much improved bias bound for any
system whose truncated weighting sequence elements are described by a single
dominant exponential decay.
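As a quick numerical illustration of eqn 5.30, the sketch below sums the truncated tail of a first-order weighting sequence directly and compares it with the closed form. The values of `rho`, `g1` and `q` are hypothetical, and the tail is cut off after 400 terms.

```python
import numpy as np

rho, g1, q = 0.8, 0.2, 10
g = g1 * rho ** np.arange(0, 400)        # g[i-1] holds g_i = g1 * rho**(i-1)

wT = np.linspace(0.0, np.pi, 181)
z = np.exp(1j * wT)

# Direct summation of g_E(z) = sum_{i>q} g_i z^{-i}.
idx = np.arange(q + 1, 401)
series = (g[q:][None, :] * z[:, None] ** (-idx)[None, :]).sum(axis=1)

# Closed form g_{q+1} z^{-q} / (z - rho).
closed = g[q] * z ** (-q) / (z - rho)

# Frequency-dependent bound (eqn 5.30) and single bound (eqn 5.26 with theta*(q) = g_q).
B2_freq = rho**2 * g[q - 1] ** 2 / np.abs(z - rho) ** 2
B2_single = rho**2 * g[q - 1] ** 2 / (1.0 - rho) ** 2
print(B2_freq[0] / B2_single, B2_freq[-1] / B2_single)
```

The two bounds coincide at zero frequency. At $\omega T = \pi$ the frequency-dependent bound is smaller by the factor $(1-\rho)^2/(1+\rho)^2$, which is where the reduction in conservatism comes from.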

Equation 5.30 also provides a second alternative relationship between
the exponential decay constant, $\rho$, and truncation bias which can be used to
redefine the criterion in Results 5.5 and 5.6 as follows:

Result 5.10: Consider the class of systems whose weighting sequence
elements, $\theta(i)$, are accurately described by a single exponential decay for
all $i > q_b$. Let $q_{best}$ denote the (as yet unknown) order of the model
within this class which minimizes $AIC^{(2)}_q$, and let $B^2_{FR}(z;q_{best}-1)$ represent
the upper bound on frequency response bias (defined by eqn 5.30) at
$z = \exp(j\omega T)$ associated with the model of order $q_{best}-1$. For any given q,
define the scale factor $\rho_{FR}(z;q)$ so that

    $\rho^2_{FR}(z;q) \, \theta^{*2}(q) / |z - \rho_{FR}(z;q)|^2 = B^2_{FR}(z;q_{best}-1)$

Then, the quantity $T_{FR}(z;q)$ defined (for $q > q_b$) by:

    $T_{FR}(z;q) = AIC_q + \sum_{i=q_b}^{q-1} \left\{ \frac{N}{\varepsilon_i^t \varepsilon_i} \left[ \rho^2_{FR}(z;i) \, \theta^{*2}(i) \, d_i^t(i) \, d_i(i) \right] \right\}$    ....(5.31)

will possess a unique minimum for all q in the range $q_b < q < q_f$, and this
minimum will occur at $q_{best}$.

Proof: For systems in the specified class, $B^2_{FR}(z;q)$ is proportional to $B^2_q$
(eqn 5.21). As a result, the relationships used to prove Result 5.5 are
also valid for $T_{FR}(z;q)$. Thus, the proof here is identical to that of
Result 5.5.

.... QED


Again, similarities between Results 5.5 and 5.10 can be used to extend
Result 5.6 to this new situation in the following manner:

Result 5.11: Let Q define the known set of weighting sequence model
orders which contains $q_{best}$ and let $q_k$ denote the k-th element in Q. For a
specified $z = \exp(j\omega T)$, define the quantity $\tilde{T}_{FR}(z;q)$ to be:

    $\tilde{T}_{FR}(z;q) = AIC_q + \sum_{i=q_b}^{q-1} \left\{ \frac{N}{\varepsilon_i^t \varepsilon_i} \left[ \tilde{\rho}^2_{FR}(z;i) \, \hat{\theta}^{*2}(i) \, d_i^t(i) \, d_i(i) \right] \right\}$    ....(5.32)

where $\tilde{\rho}_{FR}(z;i)$ is a scale factor computed using the relationship

    $\tilde{\rho}^2_{FR}(z;i) \, \hat{\theta}^{*2}(i) / |z - \tilde{\rho}_{FR}(z;i)|^2 = \tilde{B}^2_{FR}(z)$ ,

$\hat{\theta}^*(i)$ represents the value of $\theta^*(i)$ obtained from the estimated
model of dimension i, and $\tilde{B}^2_{FR}(z)$ is non-negative.

Then for each $q_k$, examine the set of values $\tilde{T}_{FR}(z;q)$ (calculated using eqn
5.32) over the range $q_b \le q \le q_k$ and let $\tilde{B}^2_{FR,max}(z;q_k)$ denote the maximum
value of $\tilde{B}^2_{FR}(z)$ for which $\tilde{T}_{FR}(z;q)$ remains a minimum at $q_k$. The "best"
model order can now be identified by selecting the largest value of $q_k$ for
which $\tilde{B}^2_{FR,max}(z;q_k)$ is a valid upper bound for $B^2_{FR}(z;q_k-1)$.

Proof: With appropriate substitutions, the proof is identical to that of
Result 5.6.

.... QED


The proportional relationship between $B^2_{FR}(z;q)$ and $B^2_q$ also implies that the
truncation established by Results 5.10 and 5.11 will be identical to the
optimal truncation established using Results 5.5 and 5.6. However, when
Results 5.10 and 5.11 are implemented for a particular value of
$z = \exp(j\omega T)$, the maximum bias bound, $\tilde{B}^2_{FR,max}(z;q_{best})$, may be interpreted as
a valid upper bound on the frequency response bias at the specified
frequency, $\omega T$. In general, this quantity will represent a tighter bias
bound at $\omega T$ than the one established by Results 5.8 and 5.9. Furthermore,
$\tilde{B}^2_{FR,max}(z;q_{best})$ can be used as an upper bound on the frequency response bias
at all higher frequencies as demonstrated by the following result:

Result 5.12: Consider the class of systems identified in Result 5.10.
Let $B^2_{FR}(z_0;q)$ denote the frequency response bias introduced by truncation
for a specified frequency, $\omega T = \omega_0 T$; namely let

    $|g_E[\exp(j\omega_0 T)]|^2 = \rho^2 g_q^2 / |z_0 - \rho|^2 = B^2_{FR}(z_0;q)$

Then for any frequency $\omega T > \omega_0 T$,

    $|g_E[\exp(j\omega T)]|^2 \le B^2_{FR}(z_0;q)$    ....(5.33)

Proof: For systems in the specified class, the error term introduced by
truncation can be written as:

    $g_E(z) = \rho \, g_q \, z^{-q} / (z - \rho)$

After substituting $z = \exp(j\omega T)$ into this expression, the following result
can be obtained:

    $|g_E[\exp(j\omega T)]|^2 = \rho^2 g_q^2 / (1 - 2\rho \cos(\omega T) + \rho^2) = B^2_{FR}(z;q)$

For $\omega T > \omega_0 T$, $\cos(\omega T) < \cos(\omega_0 T)$ and so,

    $|g_E[\exp(j\omega T)]|^2 < \rho^2 g_q^2 / (1 - 2\rho \cos(\omega_0 T) + \rho^2) = B^2_{FR}(z_0;q)$

.... QED
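One practical reading of Result 5.12 is that a handful of evaluation frequencies is enough to cover the whole band with a "staircase" bound. The following sketch illustrates this for a first-order tail; the helper names, the anchor frequencies and the numerical values are hypothetical, and the anchor set is assumed to include zero frequency.

```python
import numpy as np

def bias_bound_sq(wT, rho, g_q):
    """Squared bias of a single-decay tail at frequency wT (radians), per eqn 5.30."""
    return rho**2 * g_q**2 / (1.0 - 2.0 * rho * np.cos(wT) + rho**2)

def staircase_bound_sq(wT, anchors, rho, g_q):
    """At each wT, reuse the bound computed at the largest anchor frequency <= wT."""
    anchors = np.sort(np.asarray(anchors, dtype=float))
    k = np.searchsorted(anchors, wT, side="right") - 1
    return bias_bound_sq(anchors[k], rho, g_q)

rho, g_q = 0.8, 0.03
anchors = np.deg2rad([0.0, 10.0, 30.0, 70.0, 120.0])   # hypothetical evaluation frequencies
wT = np.linspace(0.0, np.pi, 1801)

true_sq = bias_bound_sq(wT, rho, g_q)          # exact squared bias for this class of systems
stair_sq = staircase_bound_sq(wT, anchors, rho, g_q)
print(stair_sq[0], stair_sq[-1])
```

Adding more anchor frequencies tightens the staircase, at the cost of repeating the procedure of Result 5.11 once per anchor.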

Based  on  the  set  of  results  established  above,  the  conservatism  in  the 
single  frequency  response  bias  bound  of  Section  5.3.1  may  now  be  reduced  at 
individually  specified  frequencies  and,  in  many  cases,  the  resulting 
improvements  can  be  significant.  In  addition,  the  frequency-dependent 
procedure  suggested  by  Result  5.11  need  not  be  repeated  at  all  frequencies 
since  the  bound  established  at  any  given  frequency  establishes  an  upper 
bound  on  the  bias  at  all  higher  frequencies  as  well.  Finally,  although  this 
frequency-dependent  extension  is  strictly  valid  only  for  systems  with  a 
single  dominant  real  pole,  the  results  will,  in  practice,  generate  accurate 
bias  bounds  for  a  much  wider  range  of  systems. 

To verify this assertion, consider the system described by the transfer
function

    $g(z) = \frac{K}{a^2} \left\{ 1 + (\rho \ln\rho) \, \frac{z-1}{(z-\rho)^2} - \frac{z-1}{z-\rho} \right\}$

where $\rho = \exp(-aT)$. [This transfer function is the zero-order-hold
equivalent of $g(s) = K/(s+a)^2$.] Expanding g(z) in powers of $z^{-1}$ and
isolating the terms from (q+1) to infinity yields:

    $g_E(z) = \frac{K}{a^2} \, z^{-1} \left\{ (1-\rho) \sum_{i=q}^{\infty} \rho^i z^{-i} + (\rho \ln\rho) \sum_{i=q}^{\infty} \left[ (i+1) \rho^i z^{-i} - i \rho^{i-1} z^{-i} \right] \right\}$

Alternatively,  the  infinite  summations  can  be  eliminated  to  produce: 


Eqn  5.34  can  now  be  used  to  investigate  the  inaccuracies  introduced  by  the 
assumption  of  a  single  exponential  decay  at  any  specified  frequency. 
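The transfer function and series expansion quoted for this double-pole example can be cross-checked numerically. The sketch below builds the weighting sequence from the series coefficients, compares it with differences of the continuous-time step response (the zero-order-hold interpretation), and compares the summed series with the closed-form g(z). The values of `K`, `a` and `T` are hypothetical and the series is cut off after 600 terms.

```python
import numpy as np

K, a, T = 2.0, 1.0, 0.2
rho = np.exp(-a * T)
n = np.arange(1, 601, dtype=float)

# Coefficient of z^{-n} read off from the series expansion of g(z).
theta = (K / a**2) * (
    (1.0 - rho) * rho ** (n - 1)
    + rho * np.log(rho) * (n * rho ** (n - 1) - (n - 1) * rho ** (n - 2))
)

# Zero-order-hold check: theta(n) = s(nT) - s((n-1)T) for the step response of K/(s+a)^2.
def step(t):
    return (K / a**2) * (1.0 - np.exp(-a * t) - a * t * np.exp(-a * t))

theta_zoh = step(n * T) - step((n - 1) * T)

# Closed-form transfer function evaluated at one point on the unit circle.
z = np.exp(1j * 0.7)
g_closed = (K / a**2) * (
    1.0 + rho * np.log(rho) * (z - 1) / (z - rho) ** 2 - (z - 1) / (z - rho)
)
g_series = np.sum(theta * z ** (-n))
print(abs(g_closed - g_series))
```

Because the two constructions of the weighting sequence agree, the tail starting at element q+1 can be summed directly for any q when studying the single-decay approximation.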

In particular, when $|g_b/g_E|$ is less than unity, the bias bound defined
by eqn 5.30 will not represent an upper bound on the true frequency response
bias. Unfortunately, $|g_b/g_E|$ is a function of the number of parameters in
the model (q) and the decay constant ($\rho$) as well as frequency. As a result,
the inaccuracies introduced by the single decay assumption for each given
system and set of input/output data will be different and, so, it is not
possible to quantify these inaccuracies precisely. General characteristics
can, however, be highlighted by examining $|g_b/g_E|$ for selected values of q
and $\rho$. A closer examination of eqn 5.34 indicates that, for any given
frequency and specified value of $\rho$, $|g_b/g_E|$ is minimized when the dimension
of the truncated model is just large enough to ensure that $\rho_q < 1$.
Furthermore, as q increases, the minimum value of $|g_b/g_E|$ at each frequency
also increases. These relationships are clearly demonstrated by the results
in Table 5.1. For each value of $\rho$ in this table, the first row displays
information for the model of lowest order whose last element can be
exponentially bounded, while the second row summarizes the same information
for a larger, more realistic model order. These results clearly indicate
that, in the most pessimistic situation where q is just large enough to
establish an exponential bound, the single decay assumption leads to bias
bound errors of less than 15% over the entire frequency range and even these
errors occur only over limited ranges. More importantly however, the errors
introduced by the single decay assumption are negligible for more
representative model sizes. A similar analysis has also been performed for
systems described by a single pair of complex conjugate poles. Although the
details of the analysis are not presented here, the results indicate that
the use of a single bounding exponential decay will still generate accurate
frequency response bias bounds for frequencies sufficiently far from the
resonant frequency.

    q    |g_b/g_E| min   (ωT) min [deg]   frequencies (ωT) [deg] between which |g_b/g_E| is less than unity

    4        .917             40          (9,10.5)   (30,49)   (176,180)
   10        .987             41          (38,44)    (178,180)

    7        .946             30          (5,9)      (22,38)   (178,180)
   15        .988             31          (28,35)    (179,180)

   20        .846             1.5         (1,5)      (10,23)   (179,180)
   30        .986             2.5         (2,3)      (14,22)   (179.5,180)

Table 5.1: Worst-Case and Representative Bounds for |g_b/g_E|

The information presented above suggests that, although Results 5.10
and 5.11 were derived for the case of a single dominant real pole, they can
be applied to a much wider class of systems to establish valid and much
improved frequency-dependent bias bounds. A particularly important
characteristic of the resulting criterion is the simplicity with which these
improved bias bounds can be generated. If, however, one is concerned with
generating tighter bias bounds near a resonant frequency, it should be
possible to identify (much more complex) bias relationships which can be
employed in a criterion similar to that established by Results 5.10 and 5.11
to accomplish this task.


1.  Generate uncorrelated zero-mean normally-distributed input and
measurement noise sequences with specified variances.

2.  Generate system output data (1000 samples) using the difference
equation implied by the given g(z) and the input and noise sequences
identified in (1).

3.  Use a parameter-recursive least-squares procedure to estimate the
parameters of each weighting sequence model up to the model of order
$q_{AIC}$.

4.  Implement  the  refinement  procedure  described  in  Appendix  5.1  to 
refine  the  set  of  candidate  models. 

5.  Implement  the  6-step  truncation  algorithm  in  Appendix  5.2  to  establish 
the  optimal  truncation  and  corresponding  bias  bounds. 
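Steps 1 through 3 of a single data run can be sketched as follows. This is only an illustration under stated assumptions: the test system is a hypothetical first-order weighting sequence, the output is generated by convolution with a long (200-term) weighting sequence rather than by the difference equation, and an ordinary batch least-squares fit per model order stands in for the parameter-recursive procedure. Steps 4 and 5 (Appendices 5.1 and 5.2) are not reproduced.

```python
import numpy as np

def simulate(theta_true, n_samples, var_u, var_e, rng):
    """Steps 1-2: white Gaussian input and noise, output y(n) = sum_i theta(i) u(n-i) + e(n)."""
    u = rng.normal(scale=np.sqrt(var_u), size=n_samples)
    e = rng.normal(scale=np.sqrt(var_e), size=n_samples)
    clean = np.concatenate([[0.0], np.convolve(u, theta_true)[: n_samples - 1]])
    return u, clean + e

def fit_weighting_sequence(u, y, q):
    """Step 3 (batch stand-in): least-squares estimate of a q-term weighting sequence."""
    n = len(y)
    # Column i holds the input delayed by i+1 samples.
    X = np.column_stack(
        [np.concatenate([np.zeros(i + 1), u[: n - i - 1]]) for i in range(q)]
    )
    theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ theta_hat
    return theta_hat, float(resid @ resid)

rng = np.random.default_rng(1)
theta_true = 0.2 * 0.8 ** np.arange(200)      # hypothetical system: theta(i) = 0.2 * 0.8**(i-1)
u, y = simulate(theta_true, 1000, var_u=8.0, var_e=1.0, rng=rng)

# Fit every model order from 1 to 30 and keep (estimate, residual sum of squares).
fits = {q: fit_weighting_sequence(u, y, q) for q in range(1, 31)}
print(fits[20][0][:3], theta_true[:3])
```

The residual sum of squares stored for each order never increases with q, which is the property exploited in the proof of Result 5.6 and the reason a separate criterion is needed to decide where to truncate.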


Upon completing each data run, the following information was recorded: the
optimal truncation level ($q_{best}$) identified by the procedure, the percentage
difference between the estimated residuals ($\hat{\varepsilon}_q^t \hat{\varepsilon}_q$) and the true residuals
($e^t [I - P_q] e$) for $q = q_{best}$, and the estimated values of "parameter" bias
(eqn 5.21) and frequency response bias (at $\omega T$ = 0° and $\omega T$ = 70°) together
with the corresponding true biases for $q = q_{best}$. This information is
presented in Tables 5.2 through 5.6 and summarized below.

Table  5.2  identifies  the  50-run  average  value  of  qbest  for  each  of  the 
12  test  combinations.  As  mentioned  previously,  test  conditions  can  be 
selected  to  accurately  identify  any  number  of  weighting  sequence  elements. 
Indeed  for  fixed  N,  increasing  the  input  signal-to-noise  ratio  should 
increase  the  number  of  elements  that  can  be  accurately  estimated.  The 
results  in  Table  5.2  clearly  indicate  that  this  does  happen  when  using  the 
proposed  algorithm. 

Table  5.3  displays  the  average  difference  between  the  estimated  and 
true  residuals  for  the  specified  truncation.  This  difference  indicates  the 
size  of  Pq  for  the  given  truncation  and,  as  required  to  validate  Assumption 
5.1, these values must be significantly less than 1. As shown in the table,
the average value was less than 0.075 for each test combination; a
result which clearly demonstrates that the optimal truncation specified
using the proposed algorithm was established at a point where Assumption 5.1
is valid.

Tables 5.4 through 5.6 present the bias information generated by the
procedure. Table 5.4 presents the "parameter" bias information for each
test combination, while Tables 5.5 and 5.6 present the frequency response
bias information at $\omega T$ = 0° and $\omega T$ = 70° respectively. In each case, the
information presented includes:

1.  the average difference between the estimated bias bound and the true
bias for $q = q_{best}$;

2.  the  standard  deviation  of  the  difference  in  (1)  for  the  50  runs 
collected; 

3.  the  average  relative  difference  between  the  estimated  bias  bound  and 
the  true  bias  measured  with  respect  to  the  true  bias;  and 

4.  the  average  relative  difference  between  the  estimated  bias  bound  and 
the  true  bias  measured  with  respect  to  either  the  actual  size  of  the 
true  weighting  sequence  (for  "parameter"  bias)  or  the  actual  size  of 
the  true  frequency  response  (for  frequency  response  bias). 

The results in these tables clearly demonstrate the accuracy of the bias
bounds generated by the algorithm. The average difference between the
estimated and true frequency response biases at $\omega T$ = 0° never exceeded 1.5%
of the true DC gain and, in most instances, was less than 0.5%. Furthermore
at $\omega T$ = 70°, the estimated biases established tight upper bounds on the true
bias even for systems with closely spaced real poles or double poles; thus
demonstrating that valid and significant improvements can be achieved using
the frequency-dependent criterion of Results 5.10 and 5.11. Indeed, from
the set of Monte Carlo results presented here, it is clear that the proposed
truncation criteria and corresponding algorithms can be used not only to
identify the optimal truncation, but also to accurately describe the
associated bias.


Table 5.2: Average Model Order of Optimal Truncation

System                      $\sigma_u^2/\sigma_e^2$ =     8       16      64

  1       Average Order                17.7    20.2    24.0
          Std Deviation                 1.9     2.2     1.8

  2       Average Order                26.6    29.5    34.1
          Std Deviation                 1.9     2.5     1.8

  3       Average Order                20.5    22.6    27.5
          Std Deviation                 2.1     1.9     1.7

  4       Average Order                17.7    19.7    23.8
          Std Deviation                 1.7     1.6     1.7

Table  5.3:  Residual  Information  for  Optimal  Truncation 


System 

2.  2 

c  t  a 
u  e 

8 

16 

64 

Average  Relative  Error 
(vrt  e^I-PJ  e) 

.053 

.049 

.056 

1 

Std  Deviation 

.035 

.035 

.035 

Average  Relative  Error 
(vrt  e11  ( I-P ]  e) 

.065 

.061 

.072 

2 

Std  Deviation 

.033 

.031 

.042 

Average  Relative  Error 
(vrt  e^I-PJ  t) 

.056 

.058 

.052 

3 

Std  Deviation 

.035 

.050 

.026 

Average  Relative  Error 
(vrt  c 1 ( I -P J  e) 

.048 

.051 

.055 

4 

Std  Deviation 

.027 

.025 

.038 
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Table  5.4:  "Parameter" 

Bias  for 

Optimal  Truncation 

System 

2,  2 

a  /  a 
u  e 

8 

16 

64 

Avg  Absolute  Error 

<W  •'  '9‘  e> 

.021 

.012 

.004 

1 

Std  Deviation 

.038 

.024 

.010 

Avg  Relative  Error 
(wrt  /  6t  0) 

.247 

.207 

.155 

Avg  Relative  Error 
(vrt  /  0l  0) 

.0147 

.0087 

.0028 

Avg  Absolute  Error 

(B  -  /  0C  0) 

max  ’ 

-  003 

.003 

.002 

2 

Std  Deviation 

.030 

.027 

.014 

Avg  Relative  Error 
(vrt  /  0f  0) 

-.010 

.066 

.075 

Avg  Relative  Error 
(vrt  /  0f  0) 

-.0027 

.0029 

.0024 

Avg  Absolute  Error 

(B  -  /  0l  0) 

max  ' 

.010 

-.0002 

.004 

3 

Std  Deviation 

.035 

.018 

.012 

Avg  Relative  Error 
(vrt  /  0l  9) 

.132 

-.010 

.155 

Avg  Relative  Error 
(vrt  /  0*  0) 

.0084 

-.0001 

.0032 

Avg  Absolute  Error 

(B  -  /  9) 

max  ’ 

.017 

.011 

.004 

4 

Std  Deviation 

.033 

.020 

.012 

Avg  Relative  Error 
(vrt  /  01  9) 

.203 

.195 

.156 

Avg  Relative  Error 
(vrt  /  0l  0) 

.0123 

.0080 

.0028 
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Table  5.5:  Frequency  Response 

Bias  for 

Optimal  Truncation  (wT 

=  0°) 

System 

2.  2 

a  /  a  = 
u  e 

8 

16 

64 

Avg  Absolute  Error 

.046 

.023 

-.003 

1 

(Bmax~  ISE1* 

Std  Deviation 

.193 

.116 

.049 

Avg  Relative  Error 
(wrt  |gE|) 

.139 

.084 

-.005 

Avg  Relative  Error 
(wrt  |g|) 

.0091 

.0046 

-.0041 

Avg  Absolute  Error 

-.070 

-.024 

-.010 

2 

(Bmax'  lgE^ 

Std  Deviation 

.161 

.152 

.073 

Avg  Relative  Error 
(wrt  |gE|) 

-.176 

-.091 

-.082 

Avg  Relative  Error 
(wrt  |g|) 

-.0140 

-.0048 

-.0020 

Avg  Absolute  Error 

-.003 

-.040 

.0003 

3 

(Bmax~  lgEl:) 

Std  Deviation 

.178 

.085 

.061 

Avg  Relative  Error 
(wrt  |gE|) 

-.008 

-.169 

.024 

Avg  Relative  Error 
(wrt  |g|) 

-.0007 

-.0081 

.0001 

Avg  Absolute  Error 

.017 

.009 

-.005 

(Bmax" 

Std  Deviation 

.157 

.090 

.054 

4 

Avg  Relative  Error 
(wrt  |gE|) 

.050 

.054 

-.019 

Avg  Relative  Error 
(wrt  |g|) 

.0035 

.0018 

-.0010 
Table 5.6: Frequency Response Bias for Optimal Truncation ($\omega T = 70°$)

Columns give results for $\sigma_u^2/\sigma_e^2$ = 8, 16 and 64.

System  Statistic                                 8        16       64

1       Avg Absolute Error (B_max - |g_E|)       .014     .009     .004
        Std Deviation                            .011     .007     .003
        Avg Relative Error (wrt |g_E|)           .382     .359     .305
        Avg Relative Error (wrt |g|)             .0200    .0128    .0060

2       Avg Absolute Error (B_max - |g_E|)       .008     .007     .004
        Std Deviation                            .009     .006     .004
        Avg Relative Error (wrt |g_E|)           .214     .291     .250
        Avg Relative Error (wrt |g|)             .0988    .0881    .0449

3       Avg Absolute Error (B_max - |g_E|)       .012     .005     .004
        Std Deviation                            .012     .005     .004
        Avg Relative Error (wrt |g_E|)           .330     .195     .312
        Avg Relative Error (wrt |g|)             .0028    .0118    .0089

4       Avg Absolute Error (B_max - |g_E|)       .013     .009     .004
        Std Deviation                            .011     .007     .003
        Avg Relative Error (wrt |g_E|)           .316     .314     .307
        Avg Relative Error (wrt |g|)             .0303    .0204    .0095

Appendix 5.1: Refinements to the Set Containing $q_{best}$

As indicated in Section 5.2, the optimal truncation level ($q_{best}$) must
necessarily lie in the range $q_0 \le q \le q_{AIC}$. It is, however, possible to
refine this set of candidates for $q_{best}$ still further using the values of
AIC (calculated per eqn 5.3). In particular, Result 5.5 demonstrates that
$T_{q_{best}}$ is the minimum value of $T_q$ for all $q$ in the range
$q_0 \le q \le q_{AIC}$. But $T_q$ is related to $AIC_q$ by eqn 5.22 and, for
$q < q_{best}$, the following relationship can be established:


T  -  T 


=  AIC  -  AIC 


*best 


^best 


(  N  ~2  ~t  ~ 

1  \  T"-  [pi  9  (i)  dl(i)  di<i)U 

i=q  l  e.  e.  J 


>  0 


Since the summation in the above expression is always positive, $AIC_q$ must
necessarily be greater than $AIC_{q_{best}}$ for all $q < q_{best}$. Hence, if
(for any given $q_1$ in the range $q_0 \le q_1 \le q_{AIC}$) $AIC_{q_1}$ is not a
minimum over the range $q_0 \le q \le q_1$, then $q_1$ cannot be equal to
$q_{best}$ and the truncation level defined by $q_1$ can be dropped from further
consideration. This procedure can be used for each value of $q$ in the range
$q_0 \le q \le q_{AIC}$ to reduce the number of possible candidates for
$q_{best}$.
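
The screening rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `refine_candidates`, the list layout (one AIC value per model order, starting at $q_0$) and the numerical values are assumptions made for the example, not quantities taken from the thesis.

```python
def refine_candidates(aic, q0):
    """Return the model orders that can still be q_best.

    aic[k] is assumed to hold the AIC value of the model of order q0 + k.
    An order q1 is kept only if its AIC is strictly smaller than the AIC
    of every smaller order, i.e. it is the minimum over q0 <= q <= q1.
    """
    candidates = []
    running_min = float("inf")
    for offset, value in enumerate(aic):
        if value < running_min:
            candidates.append(q0 + offset)
            running_min = value
    return candidates


# Hypothetical AIC values for orders 3, 4, ..., 8 (illustration only).
example_aic = [10.0, 8.0, 9.0, 7.5, 7.5, 6.0]
remaining = refine_candidates(example_aic, q0=3)
```

With these made-up values, orders 5 and 7 are discarded because a smaller model already attains an AIC that is at least as low.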


Appendix 5.2: A Complete Procedure for Optimal Weighting Sequence Truncation

The optimal truncation criterion defined by Results 5.5 and 5.6 relies
on the ability to generate an estimate for the upper bound on bias
associated with the specified truncation level as well as the ability to
validate this bias bound. A complete procedure to implement the criterion
is described here. To begin, a least-squares estimation algorithm that is
recursive in the number of parameters [EYK1, STR1] can be used to generate
the weighting sequence parameter estimates for all model orders from $q_0$ to
$q_{AIC}$. If desired, the procedure discussed in Appendix 5.1 can then be
implemented to refine the set of model orders, Q, which contains $q_{best}$.

After the appropriate set has been identified, the following 6-step
procedure can be used to identify the optimal truncation:


Step 1: Procedure Initialization

Procedure: Arrange the elements of Q in descending order,
$q_1 > q_2 > \cdots > q_n$, and let $q_{st} = q_1$.


Step 2: Bias Initialization/Identification of Candidate Model

Procedure: Initialize the "parameter" bias estimate to

    $\hat{B}^2 = \hat{\theta}^2_{q_{st}}(q_{st})$    ....(A5.1)

Using this value of $\hat{B}^2$, calculate $T_q$ (eqn 5.24) for all
$q \le q_{st}$ and identify the model order, $q_{new}$, which minimizes $T_q$
over this range. If $q_{new} = q_{st}$, proceed to Step 3. If
$q_{new} < q_{st}$, set $q_{st}$ equal to the next element in Q and return to
the start of Step 2.

Comment: The true bias bound for the model of order $q_{st}-1$ satisfies the
relationship

    $B^2_{(q_{st}-1)} \ge \sum_{i=q_{st}}^{\infty} \theta^{*2}(i)$

Hence, when $\hat{\theta}(q_{st})$ is an accurate estimate of
$\theta^{*}(q_{st})$, $\hat{B}^2$ (defined by eqn A5.1) will necessarily be
less than or equal to $B^2_{(q_{st}-1)}$. Now if $q_{new} \ne q_{st}$, then
$\hat{B}^2_{max}(q_{st}) < \hat{\theta}^2(q_{st})$. So, $\hat{B}^2_{max}(q_{st})$
is not a valid upper bound for $B^2_{(q_{st}-1)}$ and, by Result 5.6, $q_{st}$
cannot be $q_{best}$. The search for $q_{best}$ must, therefore, proceed to the
next element of Q.

Step 3: Bias Identification for Candidate Model

Procedure: With $\hat{B}^2$ in eqn A5.1 established as a lower bound for the
bias, iteratively update the bias to identify the largest value,
$\hat{B}^2_{max}(q_{st})$, for which $T_{q_{st}}$ remains a minimum. At the
same time, identify the dimension of the new model ($q_{new}$) which minimizes
$T_q$ for $\hat{B}^2_{max}(q_{st}) + \epsilon$, where $\epsilon$ is arbitrarily
small.

Comment: This step identifies the best estimate of the bias bound
associated with the model of order $q_{st}$. The identification of $q_{new}$
provides additional information which can be used to validate this
estimate as shown in Step 5.


Step  4:  Bias  Verification  —  Part  1 

Procedure: For the estimated model of order $q_{st}$, calculate

    $B_L^2 = \rho_L^2\,\hat{\theta}^2_{q_{st}}(q_{st}) / (1 - \rho_L^2)$    ....(A5.2)

where $\rho_L$ is defined by the relationship:

2  2 

’IK  '  <1  -  'L1  '  %  <qst>  ....(*5.3) 

Mst  Hst 

If $\hat{B}^2_{max}(q_{st})$ is greater than $B_L^2$, proceed to Step 5. If
not, set $q_{st}$ equal to the next element in Q and return to Step 2.


Comment: Step 4 is simply a replay of Step 2 using a different lower
bound for $B^2_{(q_{st}-1)}$. In particular, $B^2_{(q_{st}-1)}$ is defined by:


V-U  -  6  <qs.>  '  «  - 

As  a  result,  the  following  condition  must  be  true  for  any  value  of  p  in 


’l  -1  e*<qs,-1>  '  <>  -  ’l  -1> 

MSt  MSt 


the  range  0  <  p  <  p  ^ : 


s  t 


J. 


B(qst-1)  -  °2  '  U  -  “2) 

-.2 


Now if $q_{st} = q_{best}$, then $\hat{\theta}^2(q_{st})$ is less than or equal
to $B^2_{(q_{st}-1)}$ and calculating $\rho_L$ (per eqn A5.3) will produce a
value for $\rho$ in the range $0 < \rho < \rho_{q_{st}}$. In this case, $B_L^2$
(defined by eqn A5.2) will necessarily be less than or equal to
$B^2_{(q_{st}-1)}$. So if $\hat{B}^2_{max}(q_{st}) < B_L^2$, then
$\hat{B}^2_{max}(q_{st})$ cannot be a valid upper bound for $B^2_{(q_{st}-1)}$.
Thus by Result 5.6, $q_{st}$ cannot be $q_{best}$ and the search for $q_{best}$
must proceed to the next element of Q.
Step 5: Bias Verification - Part 2

Procedure: For the estimated model of order $q_{st}$, calculate

    $B_L^2 = \hat{\theta}^2_{q_{st}}(q_{new}+1) + \cdots + \hat{\theta}^2_{q_{st}}(q_{st})$    ....(A5.4)

If $\hat{B}^2_{max}(q_{st})$ is greater than $B_L^2$, proceed to Step 6. If
not, set $q_{st}$ equal to the next element in Q and return to Step 2.

Comment: For any $q$ in the range $q_0 \le q \le q_{best}$, $T_q$ will be less
than $T_{q'}$ for all $q' < q$ when $\hat{B}^2 = B_q^2$. (This result is
established by Result 6.3 in Chapter 6.) Since $\hat{B}^2_{max}(q_{st}) +
\epsilon$ minimizes $T_q$ at $q_{new}$, $\hat{B}^2_{max}(q_{st})$ must
therefore be greater than or equal to $B^2_{q_{new}}$ for $q_{st}$ to be
$q_{best}$. But,

    $B^2_{q_{new}} \ge \sum_{i=q_{new}+1}^{\infty} \theta^{*2}(i) \ge \sum_{i=q_{new}+1}^{q_{st}} \theta^{*2}(i)$

So if $\hat{B}^2_{max}(q_{st}) < B_L^2$, $\hat{B}^2_{max}(q_{st})$ is not a
valid upper bound for $B^2_{q_{new}}$. $\hat{B}^2_{max}(q_{st})$ must,
therefore, be an erroneous estimate of bias. Hence by Result 5.6, $q_{st}$
cannot be $q_{best}$ and the search for $q_{best}$ must proceed to the next
element of Q.


Step 6: Final Result

Based on all available information, $\hat{B}^2_{max}(q_{st})$ is a valid upper
bound for $B^2_{(q_{st}-1)}$. Since all elements of Q greater than $q_{st}$
have already been eliminated, Result 5.6 implies that $q_{st}$ is the optimal
truncation level, $q_{best}$.
CHAPTER SIX


COMPLETE DESCRIPTIONS OF FREQUENCY RESPONSE UNCERTAINTY
FOR SISO AND MIMO SYSTEMS

Results  from  the  previous  three  chapters  clearly  suggest  that 
generating  an  optimal  description  of  frequency  response  uncertainty  is  a 
multifaceted  problem.  Indeed,  an  appropriately  accurate  description  must 
account  for  both  the  statistical  uncertainty  associated  with  the  estimates 
obtained  from  the  identification  process  and  any  bias  introduced  by  this 
process.  Just  as  importantly  for  control  system  analysis  and  design,  the 
resulting  uncertainty  description  must  account  for  the  interfrequency 
dependence  of  the  estimates  so  that  it  is  valid  simultaneously  over  all 
frequencies  and,  hence,  can  be  included  in  an  analysis  based  on  classical 
frequency  response  techniques.  Furthermore,  when  parametric  models  are 
employed  to  derive  the  description,  other  concerns  such  as  the  selection  of 
an  appropriate  model  structure  and  the  identification  of  the  "best"  model 
within  the  specified  class  must  also  be  addressed  to  achieve  the  desired 
accuracy.  Finally,  a  truly  optimal  system-specific  description  of  frequency 
response  uncertainty  can  only  be  produced  if  there  exists  a  means  of 
tailoring  the  description  to  the  specific  frequency  response  characteristics 
of  the  system  under  investigation. 

All  of  the  information  needed  to  solve  this  multifaceted  problem  for 
SISO  systems  has  now  been  developed.  Furthermore,  this  information  has  been 
produced  in  a  format  that  can  be  readily  extended  to  multivariable  systems. 
Hence,  the  originally-specified  problem  of  characterizing  multivariable 
frequency  response  uncertainty  can  also  be  solved  in  a  straightforward 
manner. The goal of this chapter is to consolidate the concepts established
in  the  previous  chapters  into  a  coherent  procedure  for  the  characterization 
of  the  frequency  response  uncertainty  associated  with  both  SISO  and  MIMO 
systems.  The  development  begins  by  combining  the  statistical  information 
established  in  Chapter  3  with  the  optimal  truncation  results  proposed  in 
Chapter 5 to produce a complete description of frequency response
uncertainty  (which  includes  the  effects  of  both  estimate  variability  and 
bias)  for  SISO  systems.  Further  refinements,  based  on  the  frequency 
response  characteristics  of  the  given  system,  are  then  identified  to 
optimize  the  uncertainty  description  for  analysis  purposes,  and  an  algorithm 
is  presented  to  summarize  the  steps  in  the  uncertainty  characterization 
process.  Next,  procedures  which  extend  the  SISO  frequency  response 
uncertainty  description  to  MIMO  systems  are  established.  The  resulting 
multivariable  description  of  uncertainty  is  then  combined  with  the 
structured  uncertainty  techniques  described  in  Chapter  2  to  produce  a 
frequency  response  description  of  the  perturbed  multivariable  system  that 
can  be  readily  used  in  a  generalized-Nyquist  assessment  of  robust  stability 
and  performance.  Finally,  examples  are  presented  to  demonstrate  the 
implementation  of  the  proposed  procedures  for  both  SISO  and  MIMO  systems. 

6.1  An  Optimal  Frequency  Response  Uncertainty  Description  for  SISO  Systems 

6.1.1  Combining  Statistical  Uncertainty  and  Bias 

As implied by previous discussions, the combined effects of estimate
variability and truncation bias on system uncertainty can be examined in
detail by rewriting the z-domain transfer function as:

    $g(z) = \sum_{i=1}^{q} g_i z^{-i} + \sum_{i=q+1}^{\infty} g_i z^{-i} = g_T[z;q] + g_E[z;q]$    ....(6.1)

Eqn 6.1 conveniently separates frequency response uncertainty into two
distinct parts: the uncertainty associated with the variability of the
parameter estimates used to describe $g_T[z;q]$ and the unknown bias
$g_E[z;q]$.
This  characterization  suggests  that  a  complete  description  of  frequency 
response  uncertainty  (one  that  is  valid  simultaneously  over  all  frequencies) 
can  only  be  attained  if  both  of  these  elements  are  identified  and 
quantified.  Using  the  results  of  Chapters  3  and  5,  the  task  of  generating 
this  complete  description  can  now  be  accomplished. 
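
The split in eqn 6.1 can be illustrated numerically with a minimal Python sketch. The helper name `split_frequency_response`, the first-order weighting sequence $g_i = 0.5 \cdot 0.8^{i-1}$, the 200-term cut-off standing in for the infinite sum, and the values $q = 10$ and $\omega T = 0.3$ rad are all assumptions chosen for the example; none of them comes from the thesis.

```python
import cmath


def split_frequency_response(weights, q, omega_t):
    """Evaluate g_T and g_E of eqn 6.1 at z = exp(j*omega_t).

    weights[i-1] is assumed to hold the weighting sequence term g_i,
    so the first q entries form g_T and the remainder forms g_E.
    """
    z_inv = cmath.exp(-1j * omega_t)
    g_t = sum(g * z_inv ** i for i, g in enumerate(weights[:q], start=1))
    g_e = sum(g * z_inv ** i for i, g in enumerate(weights[q:], start=q + 1))
    return g_t, g_e


# Hypothetical first-order system: g_i = 0.5 * 0.8**(i - 1), i = 1..200.
weights = [0.5 * 0.8 ** (i - 1) for i in range(1, 201)]
g_t, g_e = split_frequency_response(weights, q=10, omega_t=0.3)

# The truncation bias can never exceed the absolute sum of the discarded terms.
tail_bound = sum(abs(g) for g in weights[10:])

# Closed-form response of the same hypothetical system, for comparison.
z_inv = cmath.exp(-1j * 0.3)
g_exact = 0.5 * z_inv / (1.0 - 0.8 * z_inv)
```

In this sketch `g_t + g_e` reproduces the closed-form response to within rounding, while `abs(g_e)` stays below `tail_bound`; the phase of `g_e` is exactly the information that is unavailable in practice.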
The statistical bounds identified by Theorem 3.3 provide a quantitative
and nonasymptotic description of the frequency response uncertainty
associated with $g_T[z;q]$. In addition, the truncation criterion described by
Results 5.8 and 5.9 provides a valid upper bound on the frequency response
bias introduced by truncation over all frequencies. Together, these results
can be used to generate a valid set of frequency response uncertainty
regions as described by the following result:


Result 6.1: Let $\hat{g}_T[z;q_{best}]$ denote the z-domain representation for
the estimated weighting sequence model of order $q_{best}$. For any specified
frequency ($\omega T$) and confidence level ($\alpha$), let
$R(\omega T;\alpha;q_{best})$ denote the confidence region for
$g_T[\exp(j\omega T);q_{best}]$ which is centred at
$\hat{g}_T[\exp(j\omega T);q_{best}]$ and whose boundary is described by
eqn 3.19. Then the true system frequency response,

    $g[\exp(j\omega T)] = g_T[\exp(j\omega T);q_{best}] + g_E[\exp(j\omega T);q_{best}]$    ....(6.2)

lies in the region $R_E(\omega T;\alpha;q_{best})$ defined by extending the
boundary of $R(\omega T;\alpha;q_{best})$ in all directions by
$B_{FR_{max}}(q_{best})$ (defined by Result 5.9).


Proof: From Result 5.12 (with $z_0 = 1$), it is clear that $B_{FR}(q_{best})$
(eqn 5.26) defines an upper bound for $|g_E[\exp(j\omega T);q_{best}]|$ over
all frequencies $0 \le \omega T \le \pi$. By Result 5.9,
$B_{FR_{max}}(q_{best})$ is a valid upper bound for $B_{FR}(q_{best})$ and,
hence, for $|g_E[\exp(j\omega T);q_{best}]|$. As a result, eqn 6.2 implies that
the true frequency response, $g[\exp(j\omega T)]$, must lie on or inside a
circle of radius $B_{FR_{max}}(q_{best})$ centred at
$g_T[\exp(j\omega T);q_{best}]$. But for the specified level of confidence,
Theorem 3.3 demonstrates that $g_T[\exp(j\omega T);q_{best}]$ lies in
$R(\omega T;\alpha;q_{best})$. Hence, $g[\exp(j\omega T)]$ must necessarily lie
in $R_E(\omega T;\alpha;q_{best})$.

.... QED



As the truncation procedure derived from Results 5.8 and 5.9 cannot generate
phase information on the bias, each region established by Result 6.1
provides the best available information on the location of the true system
frequency response at the specified frequency. In addition, the regions
$\{R_E(\omega T;\alpha;q_{best});\ 0 \le \omega T \le \pi\}$ are valid
simultaneously since $B_{FR_{max}}(q_{best})$ defines an upper bound on the
bias at each and every frequency. Hence, these extended regions not only
quantify the effects of both estimate variability and bias, but they also
continue to account for the interfrequency dependence of the estimates.
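
A rough numerical sketch of the region extension in Result 6.1 is given below. It assumes, purely for illustration, that the confidence region is an ellipse in the complex plane with known centre, semi-axes and tilt (the actual boundary is the one described by eqn 3.19, which is not reproduced here); the function name `in_extended_region` and all numerical values are hypothetical. A point belongs to the extended region if it lies inside the ellipse or within the bias bound of its boundary.

```python
import cmath
import math


def in_extended_region(point, centre, semi_axes, tilt, bias_bound, samples=720):
    """Test membership of the region obtained by growing an ellipse by bias_bound.

    The distance to the boundary is approximated by sampling the ellipse,
    so the sampled distance is never smaller than the true distance and
    points very close to the outer edge may be rejected.
    """
    a, b = semi_axes
    # Express the point in the ellipse's own (untilted, centred) axes.
    local = (point - centre) * cmath.exp(-1j * tilt)
    if (local.real / a) ** 2 + (local.imag / b) ** 2 <= 1.0:
        return True  # already inside the unextended confidence region
    nearest = min(
        abs(local - complex(a * math.cos(t), b * math.sin(t)))
        for t in (2.0 * math.pi * k / samples for k in range(samples))
    )
    return nearest <= bias_bound


# Hypothetical region: semi-axes 2 and 1, no tilt, bias bound 0.5.
inside_axis = in_extended_region(2.4 + 0j, 0j, (2.0, 1.0), 0.0, 0.5)
outside_axis = in_extended_region(2.6 + 0j, 0j, (2.0, 1.0), 0.0, 0.5)
inside_minor = in_extended_region(1.4j, 0j, (2.0, 1.0), 0.0, 0.5)
```

Because no phase information on the bias is available, the extension is applied uniformly in all directions, which is what the distance test expresses.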

The  use  of  a  single  frequency  response  bias  bound,  however,  may  produce 
an  unnecessarily  conservative  estimate  of  bias  over  certain  frequency  ranges 
as  described  in  Section  5.3.2.  If  this  occurs,  the  uncertainty  regions 
defined  by  Result  6.1  will  also  be  conservative.  In  this  case,  Results  5.10 
and  5.11  clearly  demonstrate  that  refined  frequency-dependent  bias  bounds 
may  be  introduced  to  reduce  the  conservatism  in  the  bias  estimates.  The 
resulting  frequency-dependent  bias  bounds  can  then  be  combined  with  the 
available  statistical  information  to  generate  a  much  sharper  overall 
description  of  frequency  response  uncertainty  as  demonstrated  by  the 
following  result: 


Result 6.2: Let $\hat{g}_T[z;q_{best}]$ denote the z-domain representation for
the estimated weighting sequence model of order $q_{best}$. For any specified
frequency ($\omega T$) and confidence level ($\alpha$), let
$R(\omega T;\alpha;q_{best})$ denote the confidence region for
$g_T[\exp(j\omega T);q_{best}]$ defined in Result 6.1 and let
$B_{FR_{max}}(z;q_{best})$ represent the frequency-dependent upper bound on
frequency response bias (described in Result 5.11) associated with the given
model at $z = \exp(j\omega T)$. Then, the true system frequency response at
$\omega T$ (eqn 6.2) lies in the region $R'_E(\omega T;\alpha;q_{best})$
defined by extending the boundary of $R(\omega T;\alpha;q_{best})$ in all
directions by $B_{FR_{max}}(z;q_{best})$.

max 
Proof: By Result 5.11, $B_{FR_{max}}(z;q_{best})$ is a valid upper bound for
the frequency response bias introduced by truncation at
$z = \exp(j\omega T)$. Hence, the proof of this result is identical to that of
Result 6.1 with $B_{FR_{max}}(q_{best})$ replaced by
$B_{FR_{max}}(z;q_{best})$.

.... QED


One  important  implication  of  Result  6.2  is,  of  course,  that  the 
uncertainty  bounds  of  Result  6.1  can  be  immediately  reduced  over  a  wide 
range  of  frequencies.  Indeed,  using  the  bias  bounds  identified  by  Result 
5.11,  significant  improvements  can  generally  be  obtained  over  the  entire 
frequency range ($0 \le \omega T \le \pi$) for non-oscillatory systems; while, for
oscillatory systems, the improvements will be limited to specific frequency
ranges sufficiently far from the dominant resonant frequency. It is also
interesting to note that Results 6.1 and 6.2 are not restricted only to the
truncation level defined by $q_{best}$. If valid upper bounds on frequency
response bias are available for any arbitrary truncation level $q$, Results
response  uncertainty  regions  for  this  q-parameter  weighting  sequence  model 
as  well.  As  shown  in  the  following  section,  this  observation  can  be  used  to 
generate  a  more  refined  multifrequency  description  of  uncertainty  by 
redefining  the  optimal  truncation  level  in  terms  of  the  frequency  response 
characteristics  of  the  test  system. 


6.1.2  Estimate  Uncertainty  vs  Bias:  An  Optimal  Trade-off 

Returning  for  a  moment  to  the  developments  of  Chapter  5,  it  can  be  seen 
that the optimal truncation level, $q_{best}$, identified by Results 5.5 and 5.6
was selected based on Definition 5.1. This definition, in turn, relies on
the AIC criterion (eqn 5.1) to define the optimal model order. But AIC was
developed to establish an optimal trade-off between bias and variability in
the parameter space. Thus, although $q_{best}$ defines the optimal parametric
model for the test system, it does not necessarily correspond to the model
which is optimal for frequency response applications because the criterion
used to identify $q_{best}$ fails to consider the frequency response
characteristics  of  the  system.  As  a  result,  the  overall  frequency  response 
uncertainty  regions  described  by  Results  6.1  and  6.2,  though  valid,  may  not 
be  optimal  for  the  given  system. 

To  establish  an  optimal  truncation  for  frequency  response  applications, 
the  bias/variability  trade-off  must  necessarily  be  examined  in  the  frequency 
domain.  This  suggests  that  both  the  frequency  response  bias  and  the 
variability  of  the  frequency  response  estimates  associated  with  various 
truncation  levels  must  be  quantified.  Variability  can,  in  fact,  be 
quantified  for  any  given  truncation  using  the  results  of  Chapter  3.  But  so 
far,  valid  bias  information  has  only  been  identified  for  a  single  truncation 
level (that defined by $q_{best}$). It is, however, possible to extend the bias
identification  developments  of  Chapter  5  to  other  truncation  levels  as 
demonstrated  here: 


Result 6.3: Let $q$ denote a specified model order in the range
$q_0 \le q \le q_{best}$, and let $B^2_{q-1}$ represent the upper bound on
"parameter" bias associated with the model of order $q-1$ (eqn 5.21). In
addition, redefine the scale factor $\rho_q$ (for any given $q$) such that

    $B^2_{q-1} = \rho_q^2\,\hat{\theta}^2(q) / (1 - \rho_q^2)$

Then, the quantity $T_{q'}$ defined by:

    $T_{q_0} = AIC_{q_0}$    ....(6.3a)


T  =  AIC 

q  q 


^  *  r  n  ~2  ~  t  ~  n 

+  I  IP-  9  (i)  dj(i)  d1(i )  ]  l;  q  >  qQ  ....(6. 

i=q0  V  c.  e.  ) 


3b) 


will possess a unique minimum over the restricted class of model orders
$q_0 \le q' \le q$, and this minimum will occur at $q' = q$.
Proof: For weighting sequence models, the parameter estimates obtained
for any model in the range $q_0 \le q \le q_{best}$ are, by definition, accurate
estimates for the true model parameters. Hence, any increase in model
order within this range will improve the fit to the observed data and
simultaneously improve the description of the system being identified. As
a result, the "best" order for any restricted class of model orders
$q_0 \le q' \le q$ (where $q$ is less than $q_{best}$) must necessarily
correspond to the largest model in the class, $q$.

Since $q_0 \le q \le q_{best}$, Assumption 5.1 and Result 5.3 are valid for all
$q'$ in the range $q_0 \le q' \le q$, and $AIC^{(2)}_{q'}$ (eqn 5.20) can be
used to identify the "best" model within this restricted class. But, the best
model is $q$. So by Result 5.4, $AIC^{(2)}_{q'}$ will be a minimum at $q$ for
all models in this class. Now, the proof of Result 5.5 can be used to
demonstrate that $T_q$ will be less than $T_{q'}$ for all $q' < q$.

.... QED


It is important to note that Result 6.3 is essentially a restatement of
Result 5.5 where knowledge of $q_{best}$ has been used to define a more
restricted class of weighting sequence models that still satisfy Assumption
5.1. As such, the set of results derived from Result 5.5 for the original
model class can now be extended to the restricted model class defined by
$q_0 \le q' \le q$ for any given $q$ in the range $q_0 \le q \le q_{best}$.
More specifically, Result 5.6 can be extended to establish the quantity
$\hat{B}^2_{max}(q)$ as a valid upper bound for $B^2_{q-1}$. Results 5.8 and
5.9 can be extended to establish $B_{FR_{max}}(q)$ as a valid upper bound for
the frequency response bias associated with the model of order $q$ over all
frequencies $0 \le \omega T \le \pi$. And the frequency-dependent
modifications of Results 5.10 and 5.11 can be extended to establish
$B_{FR_{max}}(z;q)$ as a valid upper bound for the frequency response bias
associated with the model of order $q$ at $z = \exp(j\omega T)$. Hence, the
truncation algorithm proposed in Chapter 5 can also be used to identify
valid bias bounds for more severe truncations than $q_{best}$ simply by
restricting the identification of minimum $T_{q'}$ to the smaller classes of
models defined by $q_0 \le q' \le q$ (where $q$ is less than $q_{best}$).

With  this  additional  bias  information  now  available,  an  optimal 
frequency-dependent  truncation  can  be  identified  in  the  following  manner: 

Criterion 6.1: For any specified frequency $\omega_1 T$, the optimal truncation
corresponds to the model order $q_{opt}(\omega_1 T)$ in the range
$q_0 \le q \le q_{best}$ which minimizes

    $\Delta g_{max}(\omega_1 T) = B_{FR_{max}}(z_1;q) + \sqrt{c_{\alpha,q}\,\lambda_{max}[H_q^T(\omega_1 T)\,V_q\,H_q(\omega_1 T)]}$    ....(6.4)

where $B_{FR_{max}}(z_1;q)$ represents a valid upper bound on the frequency
response bias at $\omega_1 T$ for the model of order $q$, $V_q$ represents the
parameter estimate covariance matrix for the model of order $q$ (eqn 3.4b),
and $H_q^T(\omega_1 T)$ and $c_{\alpha,q}$ are defined in Sections 3.1.2 and
3.2.1 respectively.

Comment: For any given $q$, the maximum distance between
$g[\exp(j\omega_1 T)]$ and $\hat{g}_T[\exp(j\omega_1 T);q]$ is bounded by:

    $|\Delta g[\exp(j\omega_1 T)]| = |g - \hat{g}_T| = |g_T - \hat{g}_T + g_E| \le |g_T - \hat{g}_T| + |g_E|$

But as shown in Chapter 3,

    $|g_T - \hat{g}_T| \le \sqrt{c_{\alpha,q}\,\lambda_{max}[H_q^T(\omega_1 T)\,V_q\,H_q(\omega_1 T)]}$

Furthermore, $|g_E| \le B_{FR_{max}}(z_1;q)$, and $B_{FR_{max}}(z_1;q)$ is known
for all $q$ in the range $q_0 \le q \le q_{best}$. Hence,
$\Delta g_{max}(\omega_1 T)$ (eqn 6.4) can be generated for each given model
order and establishes a valid upper bound for $|\Delta g[\exp(j\omega_1 T)]|$.
The optimal truncation, therefore, clearly corresponds to the model which
minimizes $\Delta g_{max}(\omega_1 T)$ over all models $q$ in the range
$q_0 \le q \le q_{best}$.
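
The selection rule in Criterion 6.1 amounts to a one-dimensional search over the candidate orders. The Python sketch below assumes that, for each order, the bias bound and the squared statistical radius (the quantity under the square root in eqn 6.4) have already been computed elsewhere; the function name `optimal_truncation` and the numerical values are hypothetical and serve only to show the trade-off.

```python
import math


def optimal_truncation(orders, bias_bounds, stat_radii_sq):
    """Pick the order minimizing (bias bound + statistical radius).

    orders[k]        : candidate model order q
    bias_bounds[k]   : assumed frequency response bias bound at the chosen frequency
    stat_radii_sq[k] : assumed squared statistical radius for the same order
    Returns (q_opt, smallest combined bound).
    """
    best_q, best_bound = None, float("inf")
    for q, bias, radius_sq in zip(orders, bias_bounds, stat_radii_sq):
        bound = bias + math.sqrt(radius_sq)
        if bound < best_bound:
            best_q, best_bound = q, bound
    return best_q, best_bound


# Hypothetical values: bias falls with q while estimate variability grows.
orders = [4, 5, 6, 7]
bias_bounds = [0.30, 0.18, 0.10, 0.06]
stat_radii_sq = [0.0025, 0.0064, 0.0225, 0.04]
q_opt, bound_opt = optimal_truncation(orders, bias_bounds, stat_radii_sq)
```

With these made-up numbers the combined bounds are 0.35, 0.26, 0.25 and 0.26, so the search settles on order 6 rather than on the largest model.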
Criterion 6.1 produces a truncation which minimizes the frequency response
uncertainty bound at a specified frequency. Furthermore, once this new
truncation is specified, the frequency response extensions of Result 6.3
(highlighted above) can be used to generate a complete set of
frequency-dependent bias bounds, $B_{FR_{max}}(z;q_{opt})$. Hence, a new set of
(modified) uncertainty bounds can be developed using Result 6.2. It should,
however,
be  noted  that  boundary  reductions  achieved  using  Criterion  6.1  will 
generally  be  restricted  to  specific  frequency  ranges.  Indeed,  the  new 
truncation  may  actually  produce  larger  uncertainty  bounds  at  other 
frequencies.  For  this  reason,  it  is  important  to  select  the  frequency  to  be 
used  in  Criterion  6.1  based  on  the  general  frequency  response 
characteristics  of  the  test  system.  When  this  is  done,  reductions  in 
boundary  size  can  be  obtained  over  the  range  of  critical  frequencies  while 
restricting  boundary  increases  to  frequencies  where  an  accurate  knowledge  of 
system  frequency  response  is  less  important. 

6.1.3  Optimal,  System-Specific  Frequency  Response  Uncertainty  Bounds 

As  suggested  above,  the  techniques  of  the  previous  chapters  can  be  used 
in  a  variety  of  ways  to  modify  and  improve  the  ultimate  description  of 
frequency  response  uncertainty.  In  fact,  these  techniques  establish  a 
useful  means  of  tailoring  this  description  to  the  specific  system  under 
investigation.  Although  an  optimal  description  of  frequency  response 
uncertainty  for  any  given  system  obviously  depends  on  the  unique 
characteristics  of  that  system  and  the  particular  applications  being 

considered, the steps required to produce this description may be summarized
as shown in the following procedure.

Procedure  6.1: 

1.  For any given set of test data, use the truncation criterion of
Results 5.5 and 5.6 to identify the model order $q_{best}$.

2.  For this specified truncation, implement the algorithm implied by
Results 5.8 and 5.9 to identify $B_{FR_{max}}(q_{best})$. Incorporate this
bias bound into the statistical description of uncertainty associated with
the given model as shown in Result 6.1. Identify the critical
frequencies of the system using this initial uncertainty description.

3.  Implement the algorithm implied by Results 5.10 and 5.11 to establish
improved bias bounds at preselected individual frequencies. Combine
these frequency-dependent bias bounds with the confidence regions for
the given model as shown in Result 6.2 to obtain a complete
description of frequency response uncertainty for the system.

4.  If refinements are required over the range of critical frequencies,
use Criterion 6.1 to select the truncation level, $q_{opt}$, which
minimizes the maximum uncertainty bound at the desired frequency.
For $q_{opt}$, identify frequency-dependent bias bounds for all frequencies,
$0 \le \omega T \le \pi$, using the frequency-domain extensions of Result 6.3.
Combine these new bias bounds with the confidence regions for the new
model as shown in Result 6.2 to obtain an optimal system-specific
description of frequency response uncertainty.

Procedure  6.1  provides  the  general  framework  for  establishing  a  description 
of  frequency  response  uncertainty  that  is  tailored  to  the  system  under 

investigation.  This  description,  in  turn,  provides  the  information  required 
to  assess  the  effects  of  uncertainty  on  closed-loop  stability  and 
performance.  An  example  which  highlights  the  development  of  this 
uncertainty  description  is  provided  in  Section  6.3.1. 

6.2  Extending  the  Uncertainty  Description  to  MIMO  Systems 

6.2.1  Joint  Uncertainty  Bounds  for  the  Elements  of  G(z) 

Since each element of an [m x m] system transfer function matrix G(z)
can be modelled as a scalar weighting sequence, SISO results from the
previous section can be adapted directly to the multivariable problem. To
accomplish this task however, uncertainty information on the individual
elements of G(z) must be combined appropriately. Clearly, a specified level
of confidence on the complete set of parameters associated with a given
multivariable transfer function matrix implies that the parameters of all
elements of the matrix must lie within their individual confidence regions
simultaneously. For the case where inputs are applied separately and
outputs are measured independently, the element-by-element parameter
estimates are independent and the probability associated with the complete
set of confidence regions is simply the product of the individual
probabilities. Thus, an $\alpha \times 100\%$ confidence bound for the
parameters of the multivariable system can be established, for example, by
generating $\alpha^{1/m^2} \times 100\%$ confidence bounds on each of the
individual elements. When this is done, the multivariable uncertainty problem
can be completely analyzed in terms of the uncertainties associated with the
scalar elements of G(z).
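
The product rule for independent elements is easy to check numerically. The following Python sketch uses two hypothetical helpers, `element_confidence` and `joint_confidence`, and an assumed overall level of 0.95 for a 2 x 2 system; the names and numbers are illustrative only.

```python
import math


def element_confidence(alpha, m):
    """Per-element confidence level giving an overall level alpha
    when all m*m elements are estimated independently."""
    return alpha ** (1.0 / (m * m))


def joint_confidence(levels):
    """Overall confidence level implied by independent per-element levels."""
    return math.prod(levels)


# Hypothetical 2 x 2 system with an overall 95% confidence requirement.
per_element = element_confidence(0.95, 2)
overall = joint_confidence([per_element] * 4)
```

Here `per_element` is roughly 0.987, so each scalar element needs a noticeably tighter confidence level than the overall requirement; `joint_confidence` can also be used to check any unequal allocation of levels across the elements.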

It  should  be  noted  that  the  independence  of  the  parameter  estimates 
introduces  additional  flexibility  in  defining  the  desired  uncertainty 
information.  For  instance,  different  confidence  levels  can  be  selected  for 
the various elements of G(z) (and hence, the size of the confidence bounds
associated  with  these  elements  can  be  adjusted)  provided  the  product  of  the 
probabilities  remains  fixed.  This  relationship  can,  in  fact,  be  used  to 
improve  the  analysis  of  perturbed  system  behaviour  even  further  as  will  be 
described  in  Section  6.2.3.  In  addition,  the  existence  of  independent 
parameter  estimates  implies  that  the  truncation  criteria  developed  in 
Chapter  5  can  be  directly  applied  to  derive  the  bias  bound  estimates 
required  to  finalize  the  desired  description  of  uncertainty  for  each  element 
of G(z). Thus, a complete element-by-element uncertainty characterization
can  be  produced  and,  as  shown  in  the  following  section,  this 
characterization  can  be  transformed  into  an  appropriate  frequency  response 
description  of  the  perturbed  multivariable  system. 


6.2.2  E-Contour  Bounds  for  the  Characteristic  Loci 

Having defined an appropriate statistical limit on the parameters of
any given element of G(z), a frequency response confidence bound (eqn 3.19)
can be established at any specified frequency ωT. Once this statistical
bound is combined with the corresponding bias bound, the maximum distance
from the nominal (estimated) frequency response to the boundary of the
uncertainty region can be calculated using eqn 6.4. Hence, the difference
between the true frequency response of element i,j ($g^{0}_{ij}$) and its
nominal value ($\hat{g}_{ij}$) is bounded by:

$$ \left| g^{0}_{ij}[\exp(j\omega T)] - \hat{g}_{ij}[\exp(j\omega T);q] \right| \le \Delta g_{ij_{\max}}(\omega T) \qquad (6.5) $$

where $\Delta g_{ij_{\max}}(\omega T)$ is defined by eqn 6.4. Using the
additive perturbation characterization discussed in Section 2.2,
$\Delta g_{ij_{\max}}(\omega T)$ becomes an element of the structured
perturbation matrix P at the selected frequency. The procedure above can be
repeated for each frequency and each element of G(z) to produce a complete
description of multivariable uncertainty in terms of the set of structured
perturbation matrices, $\{P[\exp(j\omega T)];\ 0 \le \omega T \le \pi\}$.
Because the frequency response confidence bounds are elliptical, the maximum
distance bounds used to generate P at any given frequency may produce a
conservative assessment of uncertainty. This conservatism can, however, be
eliminated by generating circular bounds at desired frequencies. Two
particularly useful techniques to achieve this goal (input selection and
parameter weighting) have already been presented and discussed in Chapter 3.
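The construction of P from element-wise bounds can be sketched as follows. This is an illustration only. It assumes that the maximum distance for each element is the bias bound plus the semi-major axis of the confidence ellipse, i.e. bias + sqrt(Q λ_max(C)), where C is the 2x2 covariance of the real and imaginary parts of the frequency response estimate and Q is the confidence quantile. That form is suggested by the column headings of Table 6.2, but eqn 6.4 itself is not reproduced in this section. All function names are hypothetical.

```python
import math


def largest_eigenvalue_sym2(c11: float, c12: float, c22: float) -> float:
    """Largest eigenvalue of the symmetric 2x2 matrix [[c11, c12], [c12, c22]]."""
    mean = 0.5 * (c11 + c22)
    radius = math.hypot(0.5 * (c11 - c22), c12)
    return mean + radius


def max_uncertainty_bound(cov, quantile: float, bias: float) -> float:
    """Bias bound plus the largest distance from the centre of the ellipse
    x^T cov^{-1} x = quantile to its boundary (assumed form of eqn 6.4)."""
    (c11, c12), (c21, c22) = cov
    if abs(c12 - c21) > 1e-12:
        raise ValueError("covariance matrix must be symmetric")
    return bias + math.sqrt(quantile * largest_eigenvalue_sym2(c11, c12, c22))


def perturbation_matrix(covs, quantile: float, biases):
    """Element-by-element bounds collected into the structured matrix P
    at one frequency; covs[i][j] and biases[i][j] belong to element i,j."""
    return [
        [max_uncertainty_bound(covs[i][j], quantile, biases[i][j])
         for j in range(len(covs[i]))]
        for i in range(len(covs))
    ]


# Hypothetical single-element example: axis-aligned ellipse plus a bias bound.
cov = ((4e-4, 0.0), (0.0, 1e-4))
print(max_uncertainty_bound(cov, quantile=9.0, bias=0.03))
```

Because only the largest semi-axis is used, an elongated ellipse is replaced by its circumscribing circle, which is the source of the conservatism noted in the text.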

Given  the  structured  uncertainty  characterization  defined  above, 
structured  E-contour  bounds  on  the  characteristic  loci  of  the  perturbed 
system can be constructed as proposed by Kouvaritakis and Latchman [KOU4],
[KOU6]. It is important to note that this E-contour procedure produces
necessary and sufficient bounds for the characteristic loci of the perturbed
multivariable system (i.e. the established bounds are attainable maximum
bounds for the characteristic loci based on the specified perturbation
matrix). This does not, however, imply that the characteristic loci of
systems  with  perturbations  outside  the  specified  class  lie  outside  these 
bounds. As a result, the $\alpha \times 100\%$ confidence established at the
parameter space level forms a lower bound on the confidence associated with
these characteristic loci bounds. However, within the framework of the given
problem,  these  bounds  provide  the  best  possible  description  of  system 
uncertainty  and,  because  they  do  contain  the  perturbed  characteristic  loci, 
these  bounds  can  be  used  in  a  generalized-Nyquist  analysis  to  assess  robust 
stability  and  performance. 
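For orientation, the nominal characteristic loci that the E-contours surround are the eigenvalues of G[exp(jωT)] traced over frequency. The sketch below computes them for a 2x2 frequency response matrix at a single frequency, together with the distance to the critical point. It does not implement the E-contour construction of [KOU4], [KOU6]; the function names and sample matrix are illustrative assumptions.

```python
import cmath


def eigenvalues_2x2(g):
    """Eigenvalues of a 2x2 complex matrix g = ((a, b), (c, d)),
    from the characteristic polynomial s^2 - trace*s + det = 0."""
    (a, b), (c, d) = g
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + disc) / 2, (trace - disc) / 2)


def distance_to_critical_point(eigs, critical=-1 + 0j):
    """Smallest distance from any characteristic locus point to the
    critical point ([-1,0] for unity feedback)."""
    return min(abs(lam - critical) for lam in eigs)


# Hypothetical frequency response matrix at one frequency.
g = ((0.5 - 0.2j, 0.1j), (0.05, -0.3 - 0.4j))
loci = eigenvalues_2x2(g)
print(loci, distance_to_critical_point(loci))
```

Repeating this over a grid of frequencies gives the nominal loci; a robust stability check would compare the uncertainty bands around each point with the critical point rather than the nominal values alone.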

6.2.3  Final  Adjustments  to  Improve  the  Uncertainty  Description 

In  any  stability  assessment  or  gain/phase  margin  determination,  the 
frequencies  of  primary  interest  are  those  where  the  characteristic  locus 
bounds  are  closest  to  the  critical  point  in  the  complex  plane  ([-1,0]  for 
unity  feedback  systems).  Hence,  techniques  which  tailor  the  description  of 
system  uncertainty  to  produce  tighter  characteristic  locus  bounds  at  the 
critical  frequencies  will  ultimately  produce  a  more  effective  analysis  of 
system  behaviour.  As  mentioned  previously,  test  input  selection  and 
parameter  weighting  (both  described  in  Section  3.4)  as  well  as  the 
identification  of  frequency-dependent  bias  bounds  (described  in  Section 
5.3.2)  can  be  used  for  this  purpose.  An  additional  technique  is  suggested 
by  the  availability  of  independent  parameter  estimates  for  each  of  the 
elements  of  G(z). 

The  E-contour  procedure  for  element-by-element  bounded  perturbations 
relies on an 'optimal cross-condition number' argument to develop
appropriate characteristic locus bounds. As the optimal cross-condition
number is obtained using scaling matrices (L and R in eqn 2.23), it may be
possible to adjust P at any given frequency (by altering the individual
confidence bounds) such that the resulting cross-condition number is
reduced. In this case, the corresponding E-contours would also be reduced.
Because of nonlinearities in the relationships between the cross-condition
number and the elements of P as well as between specified confidence levels
and  their  corresponding  confidence  limits,  it  is  not  possible  to  produce  an 
analytic  solution  for  the  appropriate  adjustments  in  P.  However, 
characteristic  locus  boundary  improvements  may  be  obtained  by  adjusting  the 
individual  frequency  response  bounds  according  to  the  following  procedure: 

Procedure  6.2: 

1.  Generate  E-contour  bounds  assuming  equal  confidence  levels  on  all 
elements  of  G(z). 

2.  At  the  desired  frequency,  alter  each  element  of  P  individually  and 
generate  new  E-contours  to  quantify  the  sensitivity  of  the  E-contours 
to  changes  in  the  individual  elements  of  P. 

3.  Using  this  sensitivity  assessment,  decrease  the  confidence  limits  on 
the  elements  of  G(z)  which  display  high  sensitivity  and  increase  the 
limits  on  those  which  display  low  sensitivity  (using  the  desired 
multivariable  confidence  level  as  the  guide  to  allowable  changes)  to 
achieve  reductions  in  the  E-contours. 

Reductions  in  the  E-contour  bounds  obtained  from  this  procedure  will 
necessarily depend on the nature of the elements of G[exp(jωT)]. In many
cases,  it  may  not  be  possible  to  obtain  significant  improvements  due  to 
limitations  on  the  allowable  reductions  in  individual  confidence  bounds 
imposed  by  the  specified  overall  confidence  level.  For  these  situations, 
the  equal  confidence  bounds  described  in  Section  6.2.1  should  be  used. 
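The constraint in step 3 of Procedure 6.2 (individual confidence levels may change only if their product stays fixed) can be sketched as below. The rescaling rule used here, a common multiplicative factor applied to all other elements, is an assumption made for illustration; the procedure in the text leaves the redistribution to the analyst's sensitivity assessment.

```python
import math


def rebalance_confidence(levels, index, new_level):
    """Set levels[index] to new_level and multiply every other level by a
    common factor so that the product of all levels is unchanged.
    Raises ValueError if any rescaled level would reach or exceed 1."""
    n = len(levels)
    if n < 2:
        raise ValueError("need at least two elements to rebalance")
    if not 0.0 < new_level < 1.0:
        raise ValueError("new_level must lie strictly between 0 and 1")
    factor = (levels[index] / new_level) ** (1.0 / (n - 1))
    result = [new_level if k == index else p * factor
              for k, p in enumerate(levels)]
    if any(p >= 1.0 for p in result):
        raise ValueError("requested change is not allowed by the joint level")
    return result


# Lower the level on a highly sensitive element of a 2x2 system.
equal = [0.9 ** 0.25] * 4
adjusted = rebalance_confidence(equal, index=0, new_level=0.95)
print([round(p, 4) for p in adjusted], round(math.prod(adjusted), 4))
```

The `ValueError` branch corresponds to the limitation noted in the text: a large reduction on one element can require levels at or above 100% elsewhere, in which case the equal-confidence bounds are retained.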

6.3  Simulation  Examples  and  Discussion 
6.3.1  Results  for  SISO  Systems 

To demonstrate the development of frequency response uncertainty bounds
using the techniques summarized in Procedure 6.1, simulation data was
generated based on the following discrete-time system transfer function:

$$ g(z) = \frac{0.29\,(z+0.0046)(z+0.758)(z-0.867)}{z\,(z-0.368)(z-0.827)(z-0.875)} = \frac{0.29\,(z^{3} - 0.105z^{2} - 0.658z - 0.003)}{z^{4} - 2.0707z^{3} + 1.3508z^{2} - 0.2665z} $$

Pseudo-random zero-mean normally-distributed input (u) and measurement noise
(e) sequences with an input-signal-to-noise ratio ($\sigma_u^2/\sigma_e^2$)
of 16 were used to drive the difference equation model defined by g(z), and
1000 input/output data samples were collected. This simulation data was then
input to a parameter-recursive least-squares algorithm to estimate the
weighting sequence parameters for models of order 5 to 49, and the resulting
estimate information was used to generate frequency response uncertainty
information as described below.
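As a consistency check on the two forms of g(z) quoted above, the factored polynomials can be multiplied out and compared with the expanded coefficients. The sketch below uses only the pole and zero values given in the text; the helper names are illustrative.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists
    in descending powers of z."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_from_roots(roots):
    """Monic polynomial with the given roots, descending powers of z."""
    coeffs = [1.0]
    for r in roots:
        coeffs = poly_mul(coeffs, [1.0, -r])
    return coeffs


# Zeros and poles read from the factored form of g(z).
num = poly_from_roots([-0.0046, -0.758, 0.867])
den = poly_from_roots([0.0, 0.368, 0.827, 0.875])

# Coefficients quoted in the expanded form of g(z).
num_quoted = [1.0, -0.105, -0.658, -0.003]
den_quoted = [1.0, -2.0707, 1.3508, -0.2665, 0.0]

print([round(c, 4) for c in num])
print([round(c, 4) for c in den])
print(max(abs(a - b) for a, b in zip(num + den, num_quoted + den_quoted)))
```

The two forms agree to within about 1e-3 per coefficient, which is consistent with the poles and zeros having been rounded to three decimal places.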

First, the refinement procedure described in Appendix 5.1 was
implemented to identify the following set of candidates for $q_{best}$:

Q = {37, 36, 28, 27, 26, ..., 5}

Next, the truncation algorithm described in Appendix 5.2 was used to
establish the point of optimal truncation (defined by Results 5.5 and 5.6)
as $q_{best}$ = 24. In contrast, the model order identified using minimum
AIC was $q_{AIC}$ = 37, while the model order identified using the
$AIC_{\alpha}$ criterion proposed by Bhansali and Downham [BHA1] was
$q_{AIC_{\alpha}}$ = 26. For $q_{best}$ = 24, the estimated upper bounds for
"parameter" bias, $B_{\max}(q_{best})$, and frequency response bias,
$B_{FR_{\max}}(q_{best})$, obtained as byproducts of the truncation process
were found to be 0.053 and 0.185 respectively (compared to the true
"parameter" bias of 0.028 and the true DC frequency response bias of 0.100).
The estimated noise variance obtained from the residuals of the
least-squares fit for $q_{best}$ = 24 was then used to identify the
parameter estimate covariance matrix, V, and V was used to generate the
frequency response confidence bounds described in Chapter 3 for a confidence
level of 95%. These confidence bounds were combined with the frequency
response bias estimate above (as suggested by Result 6.1) to produce the
uncertainty bounds displayed in Figure 6.1.
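The origin of truncation bias can be illustrated by generating the true weighting sequence (impulse response) of g(z) and measuring how much of the DC gain lies beyond a chosen lag. The sketch below is a simplified illustration, not the bias estimation procedure of Chapter 5. It uses the expanded coefficients quoted for g(z), and the convention that a model of order q retains lags 0 to q is an assumption.

```python
def impulse_response(num, den, n_terms):
    """Weighting sequence h[0..n_terms-1] of num(z)/den(z), with both
    polynomials in descending powers of z and len(num) <= len(den)."""
    n = len(den)
    # Pad the numerator so that b[k] multiplies z^-k after dividing by z^(n-1).
    b = [0.0] * (n - len(num)) + [float(c) for c in num]
    h = []
    for k in range(n_terms):
        acc = b[k] if k < n else 0.0
        for i in range(1, min(k, n - 1) + 1):
            acc -= den[i] * h[k - i]
        h.append(acc / den[0])
    return h


num = [0.29 * c for c in (1.0, -0.105, -0.658, -0.003)]
den = [1.0, -2.0707, 1.3508, -0.2665, 0.0]

h = impulse_response(num, den, 600)
dc_gain = sum(num) / sum(den)      # g(z) evaluated at z = 1

q = 24                             # assumed: lags 0..q are retained
tail = dc_gain - sum(h[: q + 1])   # DC response carried by the discarded lags
print(f"DC gain {dc_gain:.4f}, DC contribution beyond lag {q}: {tail:.4f}")
```

Evaluating the discarded part of the sequence at other frequencies (a sum of h[k] exp(-jkωT) over the omitted lags) gives the corresponding frequency-dependent truncation error for the true system.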

In addition to the uncertainty information generated above, bias bounds
at several preselected individual frequencies were calculated for
$q_{best}$ = 24 using the criterion described by Results 5.10 and 5.11. The
resulting bias bounds, $B_{FR_{\max}}(z;24)$, and maximum uncertainty
bounds, $\Delta g_{\max}$, for this frequency-dependent refinement are
displayed in Table 6.1, and the improvements over the bounds based on a
single frequency response bias bound, $B_{FR_{\max}}(24)$, are noted. The
frequency-dependent bias bounds established by this procedure were also used
to alter the frequency response uncertainty regions for this system, and
these regions are shown in Figure 6.2.

A closer examination of these uncertainty regions indicated that the
bounds closest to the [-1,0] point correspond to frequencies in the range
ωT = 50° to ωT = 90°. Criterion 6.1 was, therefore, used to identify the
truncation level which minimized the maximum uncertainty bound at ωT = 60°.
The data used to establish this new truncation level is presented in Table
6.2. This data clearly demonstrates that, although the frequency response
bias increases for more severe truncations, the combined effects of bias and
variability decrease for a limited number of more severe truncations.
Indeed, for this example, $\Delta g_{\max}$ is minimized at q = 18. Using
this new truncation, frequency response bias bounds were identified (using
the frequency response extensions of Result 6.3) for all frequencies shown
in Table 6.1, and these new bias bounds were used to produce the modified
uncertainty regions displayed in Figure 6.3. A further comparison of the
frequency-dependent bias and maximum uncertainty bounds for models of order
$q_{best}$ = 24 and $q_{opt}$ = 18 is provided in Table 6.3.
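The selection made with Criterion 6.1 can be reproduced from the entries of Table 6.2. This is a minimal sketch, assuming the criterion amounts to choosing the order with the smallest sum of bias bound and confidence bound at the chosen frequency. The numbers are those of Table 6.2; the function name is illustrative.

```python
# q: (frequency-dependent bias bound, confidence bound) at wT = 60 degrees,
# as listed in Table 6.2.
table_6_2 = {
    24: (0.0305, 0.1683),
    23: (0.0305, 0.1631),
    22: (0.0372, 0.1577),
    21: (0.0409, 0.1485),
    20: (0.0409, 0.1436),
    19: (0.0467, 0.1383),
    18: (0.0547, 0.1291),
    17: (0.0692, 0.1242),
    16: (0.0873, 0.1189),
    15: (0.1008, 0.1091),
}


def best_truncation(rows):
    """Return (q, bound) for the order with the smallest total bound."""
    totals = {q: bias + conf for q, (bias, conf) in rows.items()}
    q_opt = min(totals, key=totals.get)
    return q_opt, totals[q_opt]


q_opt, bound = best_truncation(table_6_2)
print(q_opt, round(bound, 4))   # 18 0.1838
```

The total is not monotonic in q: the bias term grows as the truncation becomes more severe while the confidence term shrinks, so the minimum sits at an intermediate order.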

The frequency response uncertainty bounds displayed in Figures 6.1,
6.2, and 6.3 are all valid bounds for the true frequency response of the
specified system. A comparison of these three descriptions, however,
clearly highlights the improvements in the description of frequency response
uncertainty that can be obtained using the refinements proposed in Section
6.1. For this system, the introduction of frequency-dependent bias bounds
and the implementation of a frequency-dependent truncation criterion

Figure 6.1: Frequency Response Uncertainty Bounds
(Model Order q_best = 24; Single Bias Bound)


Figure 6.2: Frequency Response Uncertainty Bounds
(Model Order q_best = 24; Frequency-Dependent Bias Bounds)


 ωT     B_FRmax(24)  B_FRmax(z;24)  percent    Δg_max    Δg_max      percent
(deg)                               reduction  (single)  (multiple)  reduction

 10       .1849        .1245         32.7       .3449     .2845       17.5
 20       .1849        .0750         59.4       .3579     .2480       30.7
 30       .1849        .0538         70.9       .3683     .2372       33.7
 40       .1849        .0428         76.9       .3672     .2251       38.7
 50       .1849        .0356         80.7       .3476     .1982       43.0
 60       .1849        .0305         83.5       .3532     .1988       43.7
 70       .1849        .0268         85.5       .3610     .2029       43.8
 80       .1849        .0241         87.0       .3602     .1994       44.6
 90       .1849        .0219         88.2       .3681     .2050       44.3

Table 6.1: Frequency-Dependent Bounds for Model Order q_best = 24


  q     B_FRmax(z;q)   sqrt[Q_{α,q} λ_max(H^T V H)]   Δg_max

 24       .0305                 .1683                  .1988
 23       .0305                 .1631                  .1936
 22       .0372                 .1577                  .1949
 21       .0409                 .1485                  .1894
 20       .0409                 .1436                  .1845
 19       .0467                 .1383                  .1850
 18       .0547                 .1291                  .1838
 17       .0692                 .1242                  .1934
 16       .0873                 .1189                  .2062
 15       .1008                 .1091                  .2099

Table 6.2: Maximum Uncertainty Bounds at ωT = 60°

Figure 6.3: Frequency Response Uncertainty Bounds
(Model Order q_opt = 18; Frequency-Dependent Bias Bounds)
[Plot legend: x = true frequency response; vertical axis: Imag]


 ωT     B_FRmax(z;24)  B_FRmax(z;18)  percent   Δg_max   Δg_max   percent
(deg)                                 change    (q=24)   (q=18)   change

  0       .1849          .2675         +44.7     .3924    .4269    + 8.8
 10       .1245          .1977         +58.8     .2845    .3154    +10.9
 20       .0750          .1340         +78.7     .2480    .2659    + 7.2
 30       .0538          .0988         +83.6     .2372    .2391    + 0.8
 40       .0428          .0776         +81.3     .2251    .2134    - 5.2
 50       .0356          .0640         +79.8     .1982    .1890    - 4.6
 60       .0305          .0547         +79.3     .1988    .1838    - 7.5
 70       .0268          .0480         +79.1     .2029    .1805    -11.0
 80       .0241          .0430         +78.4     .1994    .1777    -10.9
 90       .0219          .0392         +79.0     .2052    .1774    -13.5

Table 6.3: Uncertainty Bound Comparison (q_best = 24 vs q_opt = 18)

combined to reduce the maximum uncertainty bounds by nearly 50% over the
range of critical frequencies identified above. These results translate,
for example, into an improvement in gain margin (GM) from GM = 1.27 (for
$q_{best}$ = 24 and a single bias bound) to GM = 1.65 (for $q_{best}$ = 24
and frequency-dependent bias bounds) and, ultimately, to GM = 1.73 (for
$q_{opt}$ = 18 and frequency-dependent bias bounds). Furthermore, the
improvements obtained for $q_{opt}$ = 18 are generated at the expense of
only moderate increases in the uncertainty bounds at (less important) low
frequencies.

In general, it is not possible to compare the results above with those
obtained using other model selection criteria (such as AIC and the
Bhansali/Downham criterion) because these other parameter-space criteria
fail to address the problem of truncation bias. It is, however, interesting
to note the differences between the maximum uncertainty bounds (bias plus
confidence bound) derived for $q_{opt}$ = 18 using the procedures in this
chapter and the confidence bounds (which do not include bias) established
for $q_{AIC_{\alpha}}$ = 26 and $q_{AIC}$ = 37. These differences are
presented in Table 6.4, and they clearly demonstrate the improvements that
can be obtained using the algorithms proposed here. Indeed, even though the
comparison is not equitable because the bounds for $q_{opt}$ = 18 include
bias while the bounds for $q_{AIC_{\alpha}}$ = 26 and $q_{AIC}$ = 37 do not,
the use of $q_{opt}$ = 18 leads to refinements in the overall description of
frequency response uncertainty over nearly all frequencies when compared to
$q_{AIC}$ = 37 and over most of the critical frequencies when compared to
$q_{AIC_{\alpha}}$ = 26. Thus, the uncertainty bounds developed using the
procedures in this chapter not only include the effects of bias, but they
also produce a more refined description of frequency response uncertainty
than that available when other truncation criteria are used.

The set of results presented in this section clearly demonstrates the
accuracy of the frequency response uncertainty description that can be
obtained using the procedures described in Section 6.1. They also highlight
the ability to tailor the uncertainty description to the frequency response
characteristics of the particular system under investigation. The
implications of this optimal description of frequency response uncertainty
for the analysis of closed-loop behaviour are apparent.

 ωT     Maximum       Maximum      percent      Maximum      percent
(deg)   Uncertainty   Confidence   difference   Confidence   difference
        Bound         Bound                     Bound
        (q = 18)      (q = 26)                  (q = 37)

 10       .3154         .1702        +46.0        .2179        +30.9
 20       .2659         .1805        +32.1        .2609        + 1.9
 30       .2391         .2020        +15.5        .2683        -12.2
 40       .2134         .1943        + 9.0        .2629        -23.2
 50       .1890         .1720        + 9.0        .2427        -28.4
 60       .1838         .1813        + 1.4        .2666        -45.0
 70       .1805         .1855        - 2.8        .2618        -45.0
 80       .1777         .1882        - 5.9        .2579        -45.1
 90       .1774         .1970        -11.0        .2705        -52.5

Table 6.4: Uncertainty Bound Comparisons for q_opt = 18, q_AICα = 26,
and q_AIC = 37


6.3.2  Results  for  MIMO  Systems 

This  section  presents  the  complete  development  of  multivariable 
frequency  response  uncertainty  bounds  for  a  two-input/two-output  system 
using the results of Section 6.2. The following system (taken from Ibrahim
and Munro [IBR1]) was used in the example:

$$ G(s) = \begin{bmatrix} \dfrac{14.97\,(s+1.7)}{(s+10)\,a(s)} & \dfrac{95150\,(s+1.898)}{(s+100)\,a(s)} \\[2ex] \dfrac{85.2\,(s+1.44)}{(s+10)\,a(s)} & \dfrac{124000\,(s+2.037)}{(s+100)\,a(s)} \end{bmatrix} $$

where $a(s) = s^{2} + 3.225s + 2.525$.


Initial investigations indicated a requirement for compensation to
stabilize the closed-loop system. In general, the implementation of a
forward-loop multivariable compensator will combine the elements of the
original open-loop transfer function matrix and mix the uncertainties
associated with each element of the system. As a result, any new assessment
of system stability or performance based on the original uncertainty
characterization will be conservative. To eliminate this conservatism,
significant alterations to the open-loop system should be introduced prior
to implementing the identification and uncertainty characterization
procedures discussed previously. Fine-tuning adjustments can then be made
based on the results of the robustness analysis. For this problem, the
following constant precompensator was added prior to conducting the
identification tests:

$$ K = \begin{bmatrix} -3.1985 & 1.8484 \\ 0.011 & -0.00045 \end{bmatrix} $$

For simulation purposes, the transfer function matrix obtained by
combining G and K was transformed to the corresponding discrete-time
representation (of plant plus zero-order-hold) using a sample interval of
0.1 seconds. Test data was generated separately for each input (based on
the resulting discrete-time model) using independent zero-mean input and
noise sequences with variances of 1.0 and 0.04 respectively. One thousand
input/output  samples  were  collected  for  each  test.  The  uncertainty 
characterization  procedure  summarized  in  Procedure  6.1  was  then  implemented 
to  identify  proper  truncations  and  to  generate  appropriate  frequency 
response  uncertainty  bounds  for  each  element  of  G(z).  The  step-by-step 
results  of  this  procedure  are  summarized  here: 

1. Using the truncation criterion of Results 5.5 and 5.6, model orders
($q_{best}$) of 29, 17, 14, and 23 were identified for elements $g_{11}$,
$g_{12}$, $g_{21}$ and $g_{22}$ respectively.

2. The corresponding single frequency response bias bound,
$B_{FR_{\max}}(q_{best})$, associated with each of these truncations was
obtained using Results 5.8 and 5.9. Bounds of 0.067, 0.373, 0.186, and
0.318 were identified for elements $g_{11}$, $g_{12}$, $g_{21}$ and
$g_{22}$ respectively.

3. Using the maximum uncertainty bounds (bias plus confidence) implied by
the truncation and bias from steps (1) and (2), the frequency-dependent
structured perturbation matrix was constructed and E-contours were
generated as described in Section 6.2. An examination of the resulting
characteristic locus bounds indicated a range of critical frequencies
from ωT = 40° to ωT = 70° for which the bounds were closest to [-1,0].

4. Next, frequency-dependent bias bounds were generated for the
truncations specified in (1) (i.e. for q = $q_{best}$) as well as for
several more severe truncations (q < $q_{best}$). Using Criterion 6.1, a
new truncation level was selected for each element to minimize the total
uncertainty bound associated with that element at a preselected frequency
in the range [40°, 70°] and, hence, to improve the element-by-element
uncertainty description associated with the multivariable system.
Optimal truncation levels of 23, 15, 14, and 21 were identified for
elements $g_{11}$, $g_{12}$, $g_{21}$ and $g_{22}$ respectively, and the
parameter estimates for these four weighting sequence models are
superimposed on the true system parameters in Figure 6.4.

For the optimal truncations identified in (4) above, frequency response
confidence bounds were developed using Theorem 3.3. A 90% confidence level
was selected for the multivariable problem and, as the test and
identification procedures generated independent parameter estimates for each
element of G(z), this specification imposed a requirement for 97.4%
($0.9^{1/4} = 0.974$) confidence bounds on each element. Using α = .974, the
Wilson-Hilferty approximation was used to define the following frequency
response boundaries for each element:

$$ \Delta g^{T} (H^{T} V H)^{-1} \Delta g = 59.15 $$

In addition, parameter weighting was applied as described in Section 3.4.2
at ωT = 65° to generate further refinements in the element-by-element bounds
over the range of critical frequencies of the system.
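The Wilson-Hilferty approximation referred to above replaces a chi-square quantile by a cubic function of the corresponding standard normal quantile. The sketch below implements the standard form of the approximation. The degrees of freedom used in the thesis are not stated in this passage; the value 40 is an assumption chosen only because it reproduces the quoted constant of 59.15 at the 97.4% level.

```python
from statistics import NormalDist


def chi2_quantile_wilson_hilferty(prob: float, dof: float) -> float:
    """Wilson-Hilferty approximation to the chi-square quantile:
    dof * (1 - 2/(9*dof) + z * sqrt(2/(9*dof)))**3, with z the standard
    normal quantile at the same probability."""
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    if dof <= 0:
        raise ValueError("dof must be positive")
    z = NormalDist().inv_cdf(prob)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3


alpha_element = 0.90 ** 0.25   # per-element level for a 2x2 system
assumed_dof = 40               # assumption, not given in the text
print(round(chi2_quantile_wilson_hilferty(alpha_element, assumed_dof), 2))
```

The approximation is generally regarded as accurate for moderate to large degrees of freedom, which suits weighting sequence models with tens of parameters.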

The resulting parameter-weighted confidence bounds were combined (as
shown in Result 6.2) with the frequency-dependent bias bounds associated
with  the  optimal  truncation  to  produce  a  complete  description  of  frequency 
response  uncertainty  for  each  individual  element.  These  uncertainty  regions 
are  displayed  in  Figure  6.5,  and  maximum  element-by-element  bounds  at 
several  representative  frequencies  are  presented  in  Table  6.5.  The  maximum 
bounds  in  Table  6.5  as  well  as  corresponding  ones  at  other  frequencies  were 
then  used  to  create  the  frequency-dependent  structured  perturbation  matrix 
and  to  generate  structured  E-contours  for  the  perturbed  system.  The 
resulting  characteristic  locus  bounds  are  displayed  in  Figure  6.6.  As 
discussed  in  Section  6.2.2,  these  bounds  identify  the  location  of  the 
perturbed  system  characteristic  loci  based  on  the  given  frequency  response 
uncertainty  description.  Since  a  90%  confidence  level  was  used  to  develop 
this  uncertainty  description,  the  confidence  associated  with  these  bounds  is 
at  least  90%. 

The  E-contour  results  generated  for  this  particular  test  system  suggest 
that  the  frequency  response  uncertainty  associated  with  the  given  nominal 
system  will  have  a  significant  impact  on  the  performance  of  the  closed-loop 
system. Indeed, prior to the implementation of parameter weighting to
refine the element-by-element description of uncertainty, the structured
E-contour bounds for the system included the [-1,0] point, suggesting the
possibility of robust stability problems. And even though the refined
E-contours derived from the parameter-weighted results exclude the [-1,0]
point (demonstrating that stability problems are unlikely), the close
proximity of these bounds to the critical point clearly highlights the need
for additional compensation to increase the stability margins of the
perturbed system and, hence, to improve the robust performance
characteristics of the corresponding closed-loop system. From these results,
it is
clear  that  accurate  uncertainty  information  is  a  vital  element  in  any 
analysis  of  multivariable  systems.  The  procedures  described  in  this  chapter 
provide the necessary tools to include this information in the analysis.

Figure 6.4: Weighting Sequence Parameter Estimates

[Table 6.5 is only partly legible. The frequency column headings, the
element labels, and the earlier rows cannot be recovered; the rows below are
listed in source order, and "...." marks an illegible entry.]

Bias         ....  .290  .189  .130  .066  .035  .025  .019
Total        .500  .409  .289  .236  .162  .135  .121  .116

Confidence   .116  .110  .093  .099  .095  .089  .094  .101
Bias         .186  .180  .165  .146  .101  .059  .043  .033
Total        .302  .290  .258  .245  .196  .148  .137  .134

Confidence   .163  .144  .143  .142  .129  .136  .128  .125
Bias         ....  .277  .204  .159  .087  .050  .037  .028
Total        .481  .421  .347  .301  .216  .186  .165  .153

Table 6.5: Maximum Element-by-Element Bounds on G[exp(jωT)]


Figure 6.6: E-Contour Bounds for the Characteristic Loci of G(z)

CHAPTER SEVEN


A UNIFIED TIME DOMAIN/FREQUENCY DOMAIN APPROACH TO MULTIVARIABLE CONTROL

The  frequency  response  description  of  uncertainty  developed  in  the 
previous  chapters  quantifies  the  uncertainty  introduced  by  the 
identification  process.  When  combined  with  the  structured  uncertainty 
results  summarized  in  Chapter  2,  this  description  precisely  defines  the 
class  of  systems  within  which  the  true  open-loop  system  exists  and  permits 
an  analysis  of  the  robustness  properties  of  the  corresponding  closed-loop 
system.  The  remaining  task  is,  of  course,  to  design  compensation  which  will 
produce  adequate  closed-loop  response  in  the  presence  of  this  uncertainty. 

One  approach  to  the  design  of  compensators  for  uncertain  systems  is  the 
use  of  frequency-domain  techniques  based  on  the  available  uncertainty 

CO 

information.  Currently  existing  H  design  techniques,  as  discussed  in 
Chapter  1,  exemplify  this  approach.  At  the  present  time  however,  these 
methods  are  not  capable  of  including  a  precise  element-by-element 
description  of  frequency  response  uncertainty  in  the  design  since  knowledge 
of  system  uncertainty  is  used  only  implicitly  to  generate  weighting  matrices 
for  the  specified  optimization  problem.  Indeed,  some  current  studies  (e.g. 
[D0Y31,  [ BIR1 ] )  have  begun  to  address  the  problem  of  incorporating  this 

CO 

uncertainty  information  within  an  H  framework;  though  the  results  generated 
to  date  are  tentative.  An  interesting  and  potentially  useful  frequency- 
domain  alternative  is  suggested  by  the  ability  to  construct  E-contours  from 
precise  element-by-element  frequency  response  uncertainty  bounds.  As 
suggested  in  |DAN2),  manipulation  of  the  characteristic  locus  uncertainty 
bands  associated  with  a  given  system  should  provide  an  effective  approach  to 
robust  compensator  design;  an  approach  that  would  be  greatly  enhanced  by  the 
much  more  accurate  uncertainty  information  that  is  now  available.  Clearly 
however,  additional  research  is  required  to  incorporate  frequency  response 
uncertainty  information  explicitly  in  a  frequency-domain  methodology  to 
produce  robust  control  designs. 
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A  second  approach  to  control  design  in  the  presence  of  uncertainty  (one 
which  will  be  the  focal  point  of  developments  in  this  chapter  and  the  next) 
is  the  use  of  time-domain,  self-tuning  control  techniques  which  combine  on¬ 
line,  computer-implemented  control  algorithms  (to  generate  appropriate 
control  signals)  with  on-line  plant  identification  algorithms  (to  identify 
changes  in  the  system).  As  highlighted  in  Chapter  1,  a  number  of  effective 
self-tuning  control  algorithms  have  been  developed  for  SISO  systems  over  the 
past  decade.  This  success  has  prompted  attempts  by  a  number  of  researchers 
(including  Borisson  [B0R1],  Goodwin  et  al  [G002],  Koivo  [K0I1],  and  Dugard 
et  al  [DUG1])  to  extend  the  concept  of  self-tuning  control  to  multivariable 
systems.  A  majority  of  the  proposed  extensions  have,  however,  relied  on 
controller  designs  which  attempt  to  decouple  the  dynamics  of  the  open-loop 
plant  so  that  SISO  techniques  can  be  applied  directly.  Using  this  approach, 
precise  knowledge  of  the  system  transfer  function  matrix  is  required  to 
eliminate the "undesirable" multivariable characteristics of the system.
But  this  precise  knowledge  is  not  generally  available  when  implementing 
adaptive  algorithms  and,  as  a  result,  these  extensions  have  proven  to  be 
effective  only  under  severely  restricted  conditions. 

To overcome these problems, Mohtadi et al [MOH1] have proposed an
algorithm which extends the concept of a finite-time-horizon quadratic cost
function  [CLA3]  to  the  multivariable  problem.  This  methodology  reduces  the 
requirement  for  system  knowledge  to  an  upper  bound  on  the  maximum  delay  term 
in  the  system  delay  matrix  and  has  been  shown,  via  simulation,  to  produce 
effective  closed-loop  control.  However,  the  algorithm  relies  on  the 
minimization  of  a  single  scalar  function  of  the  plant  and  controller  outputs 
which  implies  that  many  of  the  special  multivariable  features  of  the  system 
cannot  be  explicitly  catered  for.  For  example,  the  approach  generally  uses 
identical  prediction  and  control  horizons  in  each  channel  for  simplicity  and 
to  avoid  the  problem  of  identifying  "correct"  input/output  pairings. 
Depending on the relative complexity of the various channels, the effect of
this may well be unnecessarily complicated control activity in certain
channels. In addition, the use of a single scalar cost function implies
that control activity and loop-by-loop performance (i.e. the closed-loop
response of the i-th output to the j-th input) can only be considered in an
aggregate  sense.  Of  course,  individual  weights  on  different  inputs  and 
outputs  can  be  used  to  control  the  performance  of  individual  loops,  but  the 
correct  choice  of  these  weights  to  achieve  an  acceptable  compromise  between 
performance  and  control  activity  is  by  no  means  obvious.  As  such,  this 
method  handles  the  inherent  problems  of  multivariable  interaction  only  in  an 
indirect  fashion. 

From  past  experience  in  generalizing  SISO  frequency  response  techniques 
to  the  standard  multivariable  control  problem,  it  would  seem  reasonable  to 
anticipate  that  appropriate  multivariable  self-tuners  could  be  realized  by 
embedding  SISO  algorithms  within  a  characteristic  locus  framework.  This 
approach  has,  however,  been  rejected  in  the  past  for  a  number  of  reasons. 
For  example,  a  link  between  the  frequency-domain  characteristics  of  the 
characteristic  locus  method  and  the  time-domain  representations  required  for 
self-tuning  implementations  is  not  readily  apparent.  Furthermore,  control 
designs  based  on  frequency  response  considerations  tend  to  rely,  to  a  large 
extent,  on  engineering  insights  that  cannot  be  conveniently  summarized  in 
implementable  computer  algorithms  for  on-line  applications.  When 
multivariable  weighting  sequence  models  are  considered  however,  it  becomes 
possible  to  establish  an  interesting  and  particularly  useful  connection 
between  time-domain  and  frequency-domain  characteristics.  Indeed,  this 
connection suggests a methodology for generalizing existing SISO self-tuning
control algorithms to multivariable systems in much the same way that the
characteristic locus method generalizes classical SISO frequency response
techniques.  Hence,  weighting  sequence  descriptions  may  be  used  not  only  in 
the  analysis  of  uncertain  multivariable  systems  but  also  in  the  design  of 
appropriate compensation via on-line self-tuning control algorithms.

The development of an appropriate multivariable self-tuning algorithm
begins  in  this  chapter  with  an  investigation  of  the  time-domain/frequency- 
domain  relationships  that  can  be  developed  for  multivariable  systems. 
Starting  from  the  "characteristic  sequences"  methodology  proposed  by 
Kouvaritakis and Kleftouris [KOU2], characteristic subsystem models based on
rational  finite-dimensional  z-domain  functions  are  derived  to  establish  a 
bridge  between  the  frequency  domain  and  the  time  domain.  The  properties  of 
this  z-domain  description  are  investigated  in  detail  and  are  shown  to  be 
particularly  amenable  to  implementation  via  convolution.  Thus,  the 
resulting  description  may  be  used  in  conjunction  with  emerging  computer 
capabilities  (such  as  Very  Large  Scale  Integration  (VLSI)  technology  and 
parallel processing algorithms) to achieve feasible on-line control
algorithms. Indeed, as shown at the end of this chapter, the proposed
methodology offers a practical and potentially much more accurate alternative to
standard  multivariable  frequency  response  control  designs  using  the 
characteristic  locus  method.  More  importantly,  it  provides  the  required 
framework  for  incorporating  SISO  self-tuning  concepts  within  a  multivariable 
generalized-Nyquist  design  philosophy  as  will  be  discussed  in  Chapter  8. 

7.1 Characteristic Sequences: A Time-Domain Alternative to the
Characteristic Locus Description for Multivariable Systems

The  characteristic  locus  method  (CLM)  described  in  Chapter  2  provides 
a  frequency-domain  approach  to  the  analysis  and  design  of  multivariable 
control  systems.  However,  implementation  of  any  control  algorithm  must 
necessarily  be  accomplished  in  the  time  domain.  This  dichotomy  has  prompted 
some  researchers  to  investigate  alternatives  to  the  CLM  based  solely  on 
time-domain  information.  One  particularly  useful  alternative  is  the 
characteristic  sequences  method  (CSM)  described  below. 

From  Section  2.1,  the  defining  equation  for  the  eigenfunctions  and 
characteristic  directions  of  a  multivariable  system  (written  now  in  terms  of 
z-transforms)  is  given  by: 


$$ G(z)\, w_i(z) = g_i(z)\, w_i(z). $$

After transforming this relationship into the time domain using inverse
z-transforms, the following convolution relationship can be established:

$$ {}^{s}G * {}^{s}w_i = {}^{s}g_i * {}^{s}w_i $$    ....(7.1)

where (i) ${}^{s}G = \{G(0), G(1), \cdots\}$ represents the weighting sequence of the
multivariable plant;

(ii) ${}^{s}g_i = \{g_i(0), g_i(1), \cdots\}$ is the 'characteristic weighting
sequence' (CWS) defined by the characteristic equation:

$$ |\,{}^{s}g_i * {}^{s}E - {}^{s}G\,| = {}^{s}0 ; \quad i = 1, \cdots, m $$

with the determinant being taken in the convolutional sense and
${}^{s}E = \{I, 0, 0, \cdots\}$; and

(iii) ${}^{s}w_i = \{w_i(0), w_i(1), \cdots\}$ is the 'characteristic vector sequence'
(CVS) corresponding to the appropriate CWS and defined by

$$ \{{}^{s}g_i * {}^{s}E - {}^{s}G\} * {}^{s}w_i = {}^{s}0 ; \quad i = 1, \cdots, m $$

This transcription of the CLM to the time domain was initially
investigated by Thiga [THI1] and Gough et al [GOU1] who relied on
convolution algebra and a Gaussian elimination-type algorithm to transform
the multivariable weighting sequence, ${}^{s}G$, into the characteristic sequences
identified above, thus producing a time-domain description of the
eigenstructure of G(z). It was suggested that this eigenstructure
information  could  be  used  to  study  system  stability  and  performance  and, 
ultimately,  to  design  time-domain  controllers  to  manipulate  these 
characteristic  sequences.  However,  the  proposed  algorithm  for  calculating 
these  sequences  has  been  found  to  be  difficult  to  implement  and,  in  general, 
does  not  admit  a  solution.  Even  when  a  solution  does  exist,  the  stability 
criterion  suggested  in  [THI1]  for  use  with  the  characteristic  sequences  will 
generally  produce  an  invalid  assessment  of  closed-loop  stability. 

To overcome these problems, Kouvaritakis and Kleftouris [KOU2] proposed
an alternative approach. In particular, they developed an efficient,
easy-to-implement "deconvolution" algorithm for calculating the CWS and CVS when
the eigenvalues of G(0) are distinct. [This standard algorithm is
summarized in Appendix 7.1. A modified version for the much more unusual
case of repeated eigenvalues has also been developed [CLO4] and is described
in Appendix 7.2.] Furthermore, they demonstrated that, if ${}^{s}W$ is defined as
the matrix sequence whose columns are the individual CVS, then a dual
eigenvector matrix sequence ${}^{s}V$ can be defined by

$$ {}^{s}W * {}^{s}V = {}^{s}V * {}^{s}W = {}^{s}E $$    ....(7.2)

and the dual CVS can also be calculated using a "deconvolution" algorithm.
As a result, the weighting sequence description of the plant can be
rewritten in terms of the characteristic sequences as:

$$ {}^{s}G = {}^{s}W * {}^{s}\Lambda * {}^{s}V $$    ....(7.3)

where the elements of the diagonal matrix sequence ${}^{s}\Lambda$ are the individual
CWS and, more importantly, these characteristic sequences can be easily
computed.
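
The sequence relationships of eqns 7.1 to 7.3 can be checked numerically on a
small example. The Python sketch below is illustrative only: the 2x2
sequences, the helper names (`seq_conv`, `seq_inverse`) and the power-series
inversion used to obtain the dual sequence are assumptions made for this
example, and they are not the "deconvolution" algorithm of Appendix 7.1. The
check relies on the fact that the first L terms of a causal convolution
depend only on the first L terms of each factor, so the identities hold
exactly over the truncated window.

```python
from fractions import Fraction as F

def mat_mul(A, B):
    """Ordinary matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    """Element-wise sum of two matrices of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def zeros(rows, cols):
    return [[F(0)] * cols for _ in range(rows)]

def seq_conv(X, Y, length):
    """First `length` terms of the convolution Z(k) = sum_i X(i) Y(k-i)."""
    rows, cols = len(X[0]), len(Y[0][0])
    out = []
    for k in range(length):
        acc = zeros(rows, cols)
        for i in range(k + 1):
            if i < len(X) and k - i < len(Y):
                acc = mat_add(acc, mat_mul(X[i], Y[k - i]))
        out.append(acc)
    return out

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def seq_inverse(W, length):
    """First `length` terms of V with W * V = E (W(0) must be invertible)."""
    V0 = inv2(W[0])
    V = [V0]
    for k in range(1, length):
        acc = zeros(2, 2)
        for i in range(1, k + 1):
            if i < len(W):
                acc = mat_add(acc, mat_mul(W[i], V[k - i]))
        V.append([[-x for x in row] for row in mat_mul(V0, acc)])
    return V

L = 6
# Hypothetical characteristic vector sequences (the columns of W) ...
W = [[[F(1), F(1)], [F(1), F(2)]], [[F(0), F(1, 2)], [F(0), F(0)]]]
# ... and hypothetical decaying characteristic weighting sequences.
g1 = [F(1, 2) ** j for j in range(L)]
g2 = [2 * F(1, 4) ** j for j in range(L)]
Lam = [[[g1[j], F(0)], [F(0), g2[j]]] for j in range(L)]
E = [[[F(1), F(0)], [F(0), F(1)]]] + [zeros(2, 2) for _ in range(L - 1)]

V = seq_inverse(W, L)                       # dual sequence (eqn 7.2)
G = seq_conv(seq_conv(W, Lam, L), V, L)     # plant sequence (eqn 7.3)

w1 = [[[M[0][0]], [M[1][0]]] for M in W]    # first CVS: first column of W
lhs = seq_conv(G, w1, L)                    # G * w_1
rhs = seq_conv(w1, [[[g]] for g in g1], L)  # g_1 * w_1 (eqn 7.1)

print(seq_conv(W, V, L) == E, seq_conv(V, W, L) == E, lhs == rhs)
```

Exact rational arithmetic is used so that the three comparisons are equality
tests rather than tolerance checks.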

Kouvaritakis and Kleftouris also recognized that the CWS can be related
to the characteristic loci via z-transforms. Indeed, provided the CWS decay
to  zero,  they  may  be  truncated  and  used  to  establish  z-domain  expressions 
which  identify  regions  in  the  complex  plane  containing  the  characteristic 
loci  of  the  system.  Thus,  closed-loop  stability  may  be  assessed  via  the 
generalized  Nyquist  criterion,  and  the  only  conservatism  introduced  in  the 
resulting  evaluation  arises  from  the  upper  bounds  used  to  account  for  the 
effects  of  truncation.  For  all  practical  purposes  then,  the  CSM  provides 
the  same  assessment  of  closed-loop  stability  as  the  CLM.  A  rudimentary 
design approach based on sequence manipulation was also proposed in [KOU2].
However,  the  relationships  between  the  time  domain  and  the  frequency  domain 
highlighted  here  suggest  that  a  more  useful  multivariable  design  methodology 
can  be  obtained  by  combining  the  CLM  and  CSM  in  a  more  fundamental  manner. 
The  necessary  relationships  and  the  resulting  design  techniques  are 
developed and discussed in the following sections.

7.2 Z-Domain Characteristic Subsystem Descriptions for MIMO Systems

7.2.1  Z-transform  Relationships  and  Power  Series  Representations 

The  characteristic  locus  theory  summarized  for  continuous-time  systems 
in  Chapter  2  also  applies,  as  mentioned  previously,  to  discrete-time  systems 
using z-transforms in place of Laplace transforms. Within this discrete-time
context, the eigenfunctions and corresponding characteristic directions
of the multivariable plant can be directly related to the characteristic
sequences described above. To establish these relationships, consider the
z-transform of the multivariable weighting sequence defined by:

$$ G(z) = \sum_{i=0}^{\infty} G_i\, z^{-i} $$    ....(7.4)

[Note:  Throughout  the  remainder  of  this  chapter  and  the  next,  it  will  be 
necessary  to  distinguish  z-domain  functions  from  the  coefficients  of  their 
corresponding  weighting  sequence  representations.  For  this  purpose,  the 
notation  •  will  be  used  to  denote  z-domain  functions  as  shown  above.] 
Using this relationship in conjunction with the convolution property of the
z-transform, eqn 7.3 may be rewritten in terms of z-transforms as:

$$ G(z) = W(z)\,\Lambda(z)\,V(z) = \sum_{i=1}^{m} g_i(z)\, w_i(z)\, v_i^T(z) $$    ....(7.5)

where $g_i(z)$, $w_i(z)$, and $v_i^T(z)$ are described by the following power series:

$$ g_i(z) = \sum_{j=0}^{\infty} g_i(j)\, z^{-j} $$    ....(7.6a)

$$ w_i(z) = \sum_{j=0}^{\infty} w_i(j)\, z^{-j} \qquad v_i^T(z) = \sum_{j=0}^{\infty} v_i^T(j)\, z^{-j} $$    ....(7.6b,c)

Eqn  7.5  describes  the  same  relationship  between  the  transfer  function  matrix 
G(z)  and  its  eigenfunctions  and  characteristic  directions  as  that  identified 
by  the  CLM  (e.g.  eqn  2.3  with  s  replaced  by  z).  This  result  implies  that 
the CWS and CVS simply define the coefficients associated with the z-domain
weighting  sequence  representations  for  the  eigenfunctions  and  corresponding 
characteristic  directions  of  G(z). 
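
As a minimal sketch of how a truncated sequence approximates the corresponding
z-domain function on the unit circle, the Python fragment below evaluates a
finite power series in z^{-1} at z = exp(jωT) and compares it with a known
closed form. The scalar sequence g(j) = a^j (whose z-transform is
1/(1 - a z^{-1})), the value a = 0.5, and the function name
`eval_on_unit_circle` are assumptions chosen for illustration; they do not
come from any plant considered in the text.

```python
import cmath
import math

def eval_on_unit_circle(seq, omega_T):
    """Evaluate sum_j seq[j] * z**(-j) at z = exp(1j*omega_T) (Horner's rule)."""
    z_inv = cmath.exp(-1j * omega_T)
    total = 0j
    for coeff in reversed(seq):
        total = total * z_inv + coeff
    return total

a = 0.5  # hypothetical pole location, inside the unit circle

def exact(omega_T):
    """Closed-form z-transform of the full sequence a**j on the unit circle."""
    return 1.0 / (1.0 - a * cmath.exp(-1j * omega_T))

grid = [math.pi * k / 50 for k in range(51)]   # 0 <= omega_T <= pi
worst_error = {}
for q in (5, 10, 20):
    truncated = [a ** j for j in range(q)]     # first q sequence elements
    worst_error[q] = max(abs(eval_on_unit_circle(truncated, w) - exact(w))
                         for w in grid)

for q, err in worst_error.items():
    print(q, err)
```

For this particular sequence the worst-case error over the grid shrinks
geometrically as more terms are retained, which is the behaviour the decay
condition is intended to guarantee.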

The  utility  of  these  z-domain  descriptions  depends  entirely  on  the 
nature of the CWS and CVS. As suggested by Kouvaritakis and Kleftouris, the
CWS can be used to assess closed-loop stability via the generalized Nyquist
criterion provided the CWS decay to zero (i.e. $\lim_{j\to\infty} g_i(j) = 0$) [KOU7]. It is
just as important to recognize that, when the CVS and dual CVS possess this
same decay property, truncated versions of these sequences can be used to
accurately approximate the characteristic directions of the plant. Under
these circumstances, the CLM and CSM may be combined to achieve acceptable
control designs via convolution implementations as will be discussed later
in this chapter. Clearly though, the key to using the CWS and CVS in
multivariable control designs is the asymptotic decay of these sequences.
For  this  reason,  it  is  particularly  important  to  investigate  the  convergence 
properties  of  these  sequences  and  to  develop  techniques  which  will  ensure 
that  the  resulting  convergence  conditions  are  satisfied. 

7.2.2 Convergence Conditions for the CWS

As discussed above, the CWS and CVS are directly related to the
eigenfunctions and characteristic directions of G(z). It therefore seems
reasonable to expect that the theory of algebraic functions can be used to
establish conditions which guarantee the convergence of the power series
representations in eqn 7.6 and, more importantly, the decay of the CWS and
CVS. In fact, specific properties of the algebraic function g(z) can be
translated directly into the desired convergence conditions. Before
starting this investigation however, it is important to recognize that,
since the characteristic gain is defined by the characteristic equation
$\Delta(z,g) = 0$ (see eqn 2.2 with s replaced by z), the properties of g depend on
the characteristics of $\Delta(z,g)$. Furthermore, $\Delta(z,g)$ can be categorized in
one of the following three ways:

Case 1: $\Delta(z,g)$ is irreducible over the field of rational functions in z.

Case 2: $\Delta(z,g)$ is reducible to nonlinear (and possibly some linear)
factors in z as suggested by eqn 2.1 (with s replaced by z).

Case 3: $\Delta(z,g)$ is completely reducible to linear factors in z (i.e.
$\Delta(z,g) = \prod_{i=1}^{m} [g(z) - g_i(z)]$ where each $g_i(z)$ is a rational
function of z).

Since the characteristics of $\Delta(z,g)$ for Case 2 are completely defined by the
characteristics for Cases 1 and 3, an investigation of the properties of g
need only examine situations when $\Delta(z,g)$ is irreducible or reducible to
linear  factors.  Furthermore,  transfer  function  matrices  arising  from 
practical  situations  will,  in  general,  lead  to  characteristic  equations 
which  are  irreducible.  For  this  reason,  attention  will  be  focused  on  Case  1 
throughout  the  remainder  of  the  discussion,  although  relevant  points  related 
to  Case  3  will  be  highlighted  when  appropriate. 

When $\Delta(z,g)$ is irreducible, several important properties of the
functions, $g_i(z)$, which describe the branches of g(z) can be identified.
One such property, which is useful for our purposes, is the following:

Lemma 7.1: Let $C_0$ denote the region of the complex z-plane defined by
$|z| > r_0$. Let $\Delta(z,g) = \det\{gI - G(z)\}$ be irreducible and of degree m in
g(z), and suppose that g(z) has no poles or branch points in $C_0$. Then,
there are m (distinct) analytic functions $g_1(z), \ldots, g_m(z)$ defined in $C_0$
which satisfy the equation $\Delta(z,g) = 0$.

Proof: See Lemma 23.1 in [SMI1].

The conditions established above for the analyticity of the $g_i(z)$ can also
be used to produce conditions for the decay of the CWS as shown in the
following theorem:

Theorem 7.1: Let $C_1$ denote the region of the complex z-plane defined by
$|z| \ge 1$. Let $\Delta(z,g)$ be irreducible and of degree m in g(z), and suppose
that g(z) has no poles or branch points in $C_1$. Then, the elements of each
CWS must decay to zero (i.e. $\lim_{j\to\infty} g_i(j) = 0$; $i = 1, \cdots, m$).

Proof: If g(z) has no poles or branch points in $C_1$, Lemma 7.1 indicates
that the branches of g(z) are analytic for all $|z| \ge 1$. By Taylor's
theorem, each $g_i(z)$ can be expanded in a power series of the form:

$$ g_i(z) = \sum_{j=0}^{\infty} g_i(j)\, z^{-j} $$    ....(7.7)

and this series must converge for all $|z| \ge 1$. In particular, it must
converge for z = 1. Thus, the infinite sum $g_i(1) = \sum_{j=0}^{\infty} g_i(j)$ must
converge, but this can only happen if $\lim_{j\to\infty} g_i(j) = 0$.        ....QED


The asymptotic decay implied by Theorem 7.1 guarantees that the CWS can be
truncated to produce sufficiently accurate z-domain representations for the
branches of the characteristic gain, g(z). Furthermore, these power series
representations can be examined to assess the effects of using finite-term
approximations. For instance, eqn 7.6a can be written as:

$$ g_i(z) = \sum_{j=0}^{q-1} g_i(j)\, z^{-j} + \delta g(z); $$

and, if $r_0$ denotes the largest distance from the origin to a pole or branch
point of g(z), it can be shown (using standard relationships, e.g. [CHU1])
that the error in the characteristic locus generated by truncating the CWS
at q elements is bounded by:

$$ |\delta g(e^{j\omega T})| \le \inf_{r}\, \sup_{\omega T}\, \{\cdots\} $$    ....(7.8)

for r in the range $r_0 < r < 1$ and $\omega T$ in the range $0 \le \omega T \le \pi$. In essence,
this result provides a means of translating desired characteristic locus
accuracy into appropriate CWS truncation levels.
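
To illustrate how a bound of this kind turns a decay rate into a truncation
level, the sketch below uses the standard Cauchy coefficient estimate
|g(j)| <= M(r) r^j, with M(r) the supremum of |g| on |z| = r, which gives a
tail bound of M(r) r^q / (1 - r) on the unit circle for any r between r_0 and
1. This specific expression is an assumption adopted for the example and is
not claimed to be the exact form of eqn 7.8; the test function
g(z) = z/(z - a), the grid sizes, and the names `cauchy_tail_bound` and
`actual_tail` are likewise hypothetical.

```python
import cmath
import math

def cauchy_tail_bound(g, q, r0, n_r=200, n_w=200):
    """Grid estimate of inf_r sup_wT |g(r e^{jwT})| r**q / (1 - r), r0 < r < 1."""
    best = math.inf
    for k in range(1, n_r):
        r = r0 + (1.0 - r0) * k / n_r
        sup_g = max(abs(g(r * cmath.exp(1j * math.pi * i / n_w)))
                    for i in range(n_w + 1))
        best = min(best, sup_g * r ** q / (1.0 - r))
    return best

a = 0.5  # hypothetical pole, so r0 = 0.5 and g(j) = a**j

def g(z):
    return z / (z - a)

def actual_tail(q, n_w=200):
    """Largest truncation error on the upper unit semicircle (grid estimate)."""
    worst = 0.0
    for i in range(n_w + 1):
        z = cmath.exp(1j * math.pi * i / n_w)
        partial = sum(a ** j * z ** (-j) for j in range(q))
        worst = max(worst, abs(g(z) - partial))
    return worst

for q in (10, 20):
    print(q, actual_tail(q), cauchy_tail_bound(g, q, r0=a))
```

In this example the bound is conservative but decreases with q in the same
way as the true error, so it can be inverted to select q for a required locus
accuracy.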


7.2.3  Convergence  Conditions  for  the  CVS 

Theorem  7.1  establishes  appropriate  conditions  for  the  decay  of  the 
CWS.  These  same  conditions  will  also  guarantee  the  decay  of  the  CVS  and 
dual  CVS.  The  first  results  which  must  be  produced  to  verify  this 
observation  are  given  by  Lemma  7.2  and  Theorem  7.2  which  identify  special 
properties of the repeated zeros of $\Delta(z,g)$ and the branch points of g(z),
respectively. These results are then combined with standard power series
relationships to establish the desired conditions for CVS convergence.

To begin, the following property of the repeated zeros of $\Delta(z,g)$ can be
identified:

Lemma 7.2: Let $\Delta(z,g)$ be irreducible, and let $z_0$ represent a point in the
complex plane which is not a pole of g(z) and for which $g_0$ is a zero of
$\Delta(z_0,g)$ with multiplicity $k_g$. Then, $dg/dz|_{(z_0,g_0)}$ is finite if, and only
if, $z_0$ is a zero of $\Delta(z,g_0)$ with multiplicity $k_z \ge k_g$.
u  u  z  g 


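Proof: When $g_0$ is a zero of $\Delta(z_0,g)$ with multiplicity $k_g$, the following
conditions will be satisfied for all $k < k_g$ and $1 \le j \le k$:

$$ \left.\frac{\partial^k \Delta}{\partial g^k}\right|_{(z_0,g_0)} = 0 \qquad \left.\frac{\partial^k \Delta}{\partial z^{k-j}\,\partial g^{j}}\right|_{(z_0,g_0)} = 0 $$    ....(7.9a,b)

Similarly when $z_0$ is a zero of $\Delta(z,g_0)$ with multiplicity $k_z$, the following
conditions will be satisfied for all $k < k_z$ and $1 \le j \le k$:

$$ \left.\frac{\partial^k \Delta}{\partial z^k}\right|_{(z_0,g_0)} = 0 \qquad \left.\frac{\partial^k \Delta}{\partial z^{j}\,\partial g^{k-j}}\right|_{(z_0,g_0)} = 0 $$    ....(7.10a,b)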

Using  these  relationships,  the  lemma  can  be  proved  in  two  parts  as  shown 
below. 

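Sufficiency: Consider the $k^{th}$ total differential of $\Delta(z,g)$ defined by:

$$ d^k(\Delta) = \left\{ \frac{\partial}{\partial z}\,dz + \frac{\partial}{\partial g}\,dg \right\}^{k} \Delta $$    ....(7.11)

Now, if eqns 7.9a,b and 7.10a,b are satisfied for all $k < k_g$ and
$1 \le j \le k$, then $d^k(\Delta) \equiv 0$. So if $k_z \ge k_g$, the following condition must
be satisfied:

$$ \left[ \frac{\partial^{k_g}\Delta}{\partial z^{k_g}} + k_g \frac{\partial^{k_g}\Delta}{\partial z^{k_g-1}\,\partial g}\left(\frac{dg}{dz}\right) + \cdots + k_g \frac{\partial^{k_g}\Delta}{\partial z\,\partial g^{k_g-1}}\left(\frac{dg}{dz}\right)^{k_g-1} + \frac{\partial^{k_g}\Delta}{\partial g^{k_g}}\left(\frac{dg}{dz}\right)^{k_g} \right]_{(z_0,g_0)} = 0 $$    ....(7.12)

Since $\Delta(z,g)$ is a polynomial in z and g, all of the partial derivative
terms in eqn 7.12 are either zero or finite. Furthermore, by assumption,
$\partial^{k_g}\Delta/\partial g^{k_g}|_{(z_0,g_0)} \ne 0$. Hence, eqn 7.12 can only be satisfied if
$dg/dz|_{(z_0,g_0)}$ is finite.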

Necessity: Assume that $dg/dz|_{(z_0,g_0)}$ is finite and $k_g > 1$. Since
$\Delta(z,g) = 0$, its first total differential must be zero. Thus,

$$ \left\{ \frac{\partial \Delta}{\partial z} + \frac{\partial \Delta}{\partial g}\,\frac{dg}{dz} \right\}\Bigg|_{(z_0,g_0)} = 0 $$    ....(7.13)

By assumption $\partial^k \Delta/\partial g^k|_{(z_0,g_0)} = 0$ for all $k < k_g$. So eqn 7.13 can only be
satisfied if $\partial\Delta/\partial z|_{(z_0,g_0)} = 0$. But this result implies that $d^2(\Delta) \equiv 0$;
and, if $k_g > 2$, this condition can only be satisfied (under the given
assumptions) if $\partial^2\Delta/\partial z^2|_{(z_0,g_0)} = 0$. But, this implies that $d^3(\Delta) \equiv 0$.
The same arguments hold for all $k < k_g$ and, hence, $\partial^k\Delta/\partial z^k|_{(z_0,g_0)} = 0$ for
all $k < k_g$. Thus, $z_0$ must be a zero of $\Delta(z,g_0)$ with multiplicity $k_z \ge k_g$.
                                                                    ....QED


With  Lemma  7.2  in  place,  the  following  special  property  of  the  branch  points 
of  g(z)  can  now  be  identified: 


Theorem 7.2: Let $\Delta(z,g)$ be irreducible, and let $z_0$ represent a point in
the complex plane which is not a pole of g(z) and for which $g_0$ is a
repeated zero of $\Delta(z_0,g)$ with multiplicity $k_g$. Then $z_0$ is a branch point
if, and only if, $g_0$ is a repeated eigenvalue of $G(z_0)$ associated with a
nonsimple Jordan form.

Proof: Since $z_0$ is not a pole of g(z), the transfer function matrix G(z)
may be rewritten in Taylor series form as:

$$ G(z) = G(z_0) + \sum_{i=1}^{\infty} G_0(i)\,(z - z_0)^i $$    ....(7.14)

(where $G_0(i)$ denotes the $i^{th}$ coefficient of the Taylor series expansion
about the point $z_0$) and this series must converge for all z in a
neighborhood of $z_0$. The proof of the theorem may now be separated into two
parts.

Necessity: Assume first that $z_0$ is a branch point of g(z). Also
assume, for the moment, that the eigenvalue/eigenvector decomposition of
$G(z_0)$ can be written in simple Jordan form (i.e.

$$ G(z_0) = W(z_0)\,\Lambda(z_0)\,V(z_0) = \begin{bmatrix} W_k & W_{m-k} \end{bmatrix} \begin{bmatrix} \Lambda_k & 0 \\ 0 & \Lambda_{m-k} \end{bmatrix} \begin{bmatrix} V_k^T \\ V_{m-k}^T \end{bmatrix} $$    ....(7.15)

where $\Lambda_k = \mathrm{diag}\{g_0, \cdots, g_0\}$; $\Lambda_{m-k} = \mathrm{diag}\{g_{k+1}, \cdots, g_m\}$; and $W_k$, $W_{m-k}$,
$V_k^T$, and $V_{m-k}^T$ are matrices of appropriate dimensions containing the
corresponding eigenvectors and dual eigenvectors of $G(z_0)$ respectively).
Using this decomposition, eqn 7.14 can be rewritten as:

$$ G(z) = W(z_0) \left\{ \Lambda(z_0) + \sum_{i=1}^{\infty} [V(z_0)\,G_0(i)\,W(z_0)]\,(z - z_0)^i \right\} V(z_0) $$    ....(7.16)

and the characteristic equation ($\det\{gI - G\} = 0$ at $g = g_0$) becomes:

$$ \Delta(z,g_0) = |W(z_0)|\,|M(z)|\,|V(z_0)| = 0 $$    ....(7.17)

where

$$ M(z) = \Lambda^*(z_0) - \sum_{i=1}^{\infty} \{V(z_0)\,G_0(i)\,W(z_0)\}\,(z - z_0)^i, \qquad \Lambda^*(z_0) = \begin{bmatrix} 0 & 0 \\ 0 & g_0 I - \Lambda_{m-k} \end{bmatrix} $$    ....(7.18)

Alternatively, M(z) can be rewritten in the following form:

$$ M(z) = \begin{bmatrix} -\sum_{i=1}^{\infty} V_k^T G_0(i) W_k (z-z_0)^i & -\sum_{i=1}^{\infty} V_k^T G_0(i) W_{m-k} (z-z_0)^i \\ -\sum_{i=1}^{\infty} V_{m-k}^T G_0(i) W_k (z-z_0)^i & (g_0 I - \Lambda_{m-k}) - \sum_{i=1}^{\infty} V_{m-k}^T G_0(i) W_{m-k} (z-z_0)^i \end{bmatrix} $$

Since $|W(z_0)|\,|V(z_0)| = 1$, eqn 7.17 reduces to $\Delta(z,g_0) = |M(z)| = 0$. Now,
Schur's formula for the determinant of partitioned matrices, given by

$$ \det \begin{bmatrix} X & Y \\ Z & W \end{bmatrix} = \det X \, \det\{W - Z X^{-1} Y\}, $$

can be used to identify $|M(z)|$. But, $(z - z_0)$ is a common factor in all
of the elements of $X = -\sum_{i=1}^{\infty} V_k^T G_0(i) W_k (z-z_0)^i$. Hence, $z_0$ must necessarily
be a zero of $\Delta(z,g_0)$ with multiplicity at least equal to $k_g$. By Lemma
7.2, this implies that $dg/dz|_{(z_0,g_0)}$ is finite and, hence, that $z_0$ is not
a branch point. This result, however, contradicts the initial assumption
and, therefore, $g_0$ must be a repeated eigenvalue of $G(z_0)$ associated with
a nonsimple Jordan form.

Sufficiency: Assume that $g_0$ is a repeated eigenvalue of $G(z_0)$
associated with a nonsimple Jordan form. Provided that the definition of
$W(z_0)$ is extended to include (wherever appropriate) pseudo-eigenvectors,
eqn 7.17 can still be used to investigate the properties of $\Delta(z,g_0)$.
However now, the upper left block of $\Lambda^*(z_0)$ (eqn 7.18) will no longer be
zero because of the unity elements in the off-diagonal positions of the
Jordan form representation for the eigenvalues of $G(z_0)$. In this
situation, $(z - z_0)$ is no longer a common factor in the upper left block
of M(z) and, hence, $z_0$ cannot be a zero of $\Delta(z,g_0)$ with multiplicity
greater than or equal to $k_g$. By Lemma 7.2 then, $dg/dz|_{(z_0,g_0)}$ is
unbounded and, hence, $z_0$ is a branch point of g(z).           ....QED
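
The distinction drawn by Lemma 7.2 and Theorem 7.2 can be seen on two small
2x2 cases. The worked example below is illustrative and is not taken from the
source: the matrices $G_1$ and $G_2$ are hypothetical, they are polynomial in z
purely for convenience (so that $z_0$ is trivially not a pole), and $z_0 \neq -1$
is assumed.

```latex
% Illustrative example (hypothetical matrices; z_0 \neq -1 is assumed).
\begin{align*}
G_1(z) &= \begin{bmatrix} 0 & 1 \\ z - z_0 & 0 \end{bmatrix},
& \Delta_1(z,g) &= g^2 - (z - z_0), \\
G_2(z) &= (z - z_0)\begin{bmatrix} 1 & 1 \\ z & -1 \end{bmatrix},
& \Delta_2(z,g) &= g^2 - (z - z_0)^2 (1 + z).
\end{align*}
% At (z_0, g_0) = (z_0, 0) both cases give \Delta(z_0, g) = g^2, so k_g = 2.
%
% Case G_1: \Delta_1(z, 0) = -(z - z_0), so k_z = 1 < k_g, and
\[
g = \pm\sqrt{z - z_0}, \qquad
\frac{dg}{dz} = \pm\frac{1}{2\sqrt{z - z_0}} \to \infty
\quad \text{as } z \to z_0 .
\]
% G_1(z_0) = [0 1; 0 0] is a nonsimple Jordan block: z_0 is a branch point.
%
% Case G_2: \Delta_2(z, 0) = -(z - z_0)^2 (1 + z), so k_z = 2 \ge k_g, and
\[
g = \pm (z - z_0)\sqrt{1 + z}, \qquad
\left.\frac{dg}{dz}\right|_{z_0} = \pm\sqrt{1 + z_0} .
\]
% G_2(z_0) = 0 = g_0 I has a simple Jordan form: z_0 is not a branch point.
```

Both characteristic polynomials are irreducible over the rational functions
in z, since neither $(z - z_0)$ nor $(1 + z)$ is a perfect square, so the
hypotheses of the lemma and the theorem are met in each case.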


Theorem  7.2  identifies  an  alternative  definition  for  the  branch  points  of 
the  algebraic  function  g(z).  More  importantly,  when  combined  with  Theorem 
7.1,  it  provides  a  means  of  establishing  the  convergence  properties  of  the 
CVS  and  dual  CVS.  The  appropriate  result  can  be  developed  in  three  stages 
as  shown  here: 


Definition 7.1: [CHU1] The Cauchy product of the two infinite power
series

$$ f(z^{-1}) = \sum_{k=0}^{\infty} a_k z^{-k} \qquad g(z^{-1}) = \sum_{k=0}^{\infty} b_k z^{-k} $$

is given by:

$$ h(z^{-1}) = \sum_{k=0}^{\infty} c_k z^{-k} $$

where the coefficients, $c_k$, are defined by:

$$ c_k = \sum_{i=0}^{k} a_i\, b_{k-i} $$

Lemma 7.3: The Cauchy product of two convergent power series converges to
the product of these series at all points interior to their circles of
convergence.

Proof: See p 164 in [CHU1].
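
Definition 7.1 is simply a discrete convolution of the two coefficient
sequences, and Lemma 7.3 can be illustrated on a case with a known closed
form. In the Python sketch below, the two geometric sequences and the
function name `cauchy_product` are assumptions chosen for illustration. With
x = z^{-1}, the series with coefficients (1/2)^k and (1/3)^k sum to
1/(1 - x/2) and 1/(1 - x/3), and a partial-fraction expansion of their
product gives c_k = 3(1/2)^k - 2(1/3)^k.

```python
from fractions import Fraction as F

def cauchy_product(a, b, n_terms):
    """Coefficients c_k = sum_{i=0}^{k} a_i * b_{k-i} for k = 0 .. n_terms-1."""
    return [sum(a[i] * b[k - i] for i in range(k + 1)
                if i < len(a) and k - i < len(b))
            for k in range(n_terms)]

n = 12
a = [F(1, 2) ** k for k in range(n)]   # coefficients of 1 / (1 - x/2)
b = [F(1, 3) ** k for k in range(n)]   # coefficients of 1 / (1 - x/3)
c = cauchy_product(a, b, n)

# Closed form obtained from the partial-fraction expansion of the product.
closed_form = [3 * F(1, 2) ** k - 2 * F(1, 3) ** k for k in range(n)]
print(c == closed_form)

# At x = 1 (i.e. z = 1) the partial sums approach 2 * 3/2 = 3.
print(float(sum(c)))
```

The point z = 1 lies inside the region of convergence of both series, so the
partial sums of the Cauchy product approach the product of the two limits, as
the lemma states.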


Theorem 7.3: If G(z) is stable, $\Delta(z,g)$ is irreducible and of degree m in
g(z), and g(z) has no branch points in $C_1$, then each of the m CVS and m
dual CVS must decay to zero.

Proof: By assumption, g(z) has no poles or branch points in $C_1$. Hence by
Theorem 7.1, each eigenfunction, $g_i(z)$, can be written as a power series
in $z^{-1}$ which converges for all z in $C_1$. Furthermore, by Theorem 7.2, W(z)
is full rank at each and every point in $C_1$. As a result, there exists a
complete set of characteristic directions, $\{w_i(z);\ i = 1, \cdots, m\}$, each of
which can be defined (for all z in $C_1$) by a relationship of the form:

$$ \{g_i(z) I - G(z)\}\, w_i(z) = 0 $$

where $g_i(z)$ is the corresponding eigenfunction of G(z).

The result above implies that each one of the m characteristic
directions can be generated completely in terms of products, sums, and/or
differences of the corresponding eigenfunction and the various elements,
$g_{ij}(z)$, of G(z). But from above, each $g_i(z)$ can be written as a power
series in $z^{-1}$ which converges for all z in $C_1$ and, since G(z) is stable,
the same is true for each $g_{ij}(z)$. Hence, $w_i(z)$ can be written entirely in
terms of Cauchy products and/or sums of power series. Lemma 7.3 and other
standard power series results can, therefore, be used to demonstrate that
the power series representation for $w_i(z)$ must converge for all z in $C_1$.
As a result, $w_i(z)$ can be written as a power series in $z^{-1}$ which converges
for all $|z| \ge 1$ and, in particular, for z = 1. But, this can only happen
if $\lim_{j\to\infty} w_i(j) = 0$. Hence the elements of each of the m individual CVS must
decay to zero.

Identical arguments can be used to prove the same result for each of
the m dual CVS since the dual characteristic directions are defined by:

$$ v_i^T(z)\, \{g_i(z) I - G(z)\} = 0 $$                          ....QED


The  fact  that  w^(j)  and  Vp(j)  decay  to  zero  guarantees  that  the  CVS  and 
dual  CVS  can  be  truncated  to  produce  sufficiently  accurate  descriptions  of 
the  characteristic  directions  of  G(z).  However  because  w^(z)  and  v^(z)  are 
not  unique,  appropriate  truncation  levels  cannot  be  identified  in  the  same 
manner  as  that  shown  previously  for  the  CVS.  Instead,  the  number  of  terms 
in  SW  and  JV  needed  to  achieve  an  appropriate  level  of  accuracy  may  be 
estimated  by  relating  W(z),  A(z)  and  V(z)  to  G(z).  In  particular,  using  the 
following  representations: 

N-l  ....  .  q-1  ... 

G(z)  =  Z  G ( j )  z  J+  6G  =  GT  +  SG  V(z)  =  Z  V(j)  z  J+  6V  =  Vp  +  SV 

j=0  j=l 

.  n-l  _.  .  .  q-1  ... 

A(z)  =  Z  A(j)  2  J+  SA  =  A^  +  $A  V(z)  =  Z  V( j )  z'J+  SV  =  V  +  6V 

j-0  j=l 

(where  n,  N,  and  q  are  selected  so  that  |6G|,  | SA| ,  |SW|,  and  |SV|  are  0(e) 

in  size  for  all  |z|  =  1),  G(z)  can  be  rewritten  as: 

G  =  GT  +  SG  =  VT  AT  VT  +  SV  AT  VT  +  VT  6A  VT  +  VT  AT  SV  +  0(e2) 

2 

By  eliminating  all  terms  of  0(e  ),  this  relationship  reduces  to: 

Vp  Ap  Vp  =  Gt  -  {SV  Ap  VT  +  VT  SA  VT  +  VT  Ap  SV)  ....(7.19) 

Furthermore, for systems where G(z) is not poorly conditioned (i.e. the eigenfunctions of G are not widely disparate and the eigenvectors of G are not excessively skew), the three perturbation terms in eqn 7.19 will be $O(\epsilon)$ in size and can be neglected when compared to $G_T$. In this case, the number of terms in the series representation of $W_T \Lambda_T V_T$ must equal the number of terms in the series representation of $G_T$ and, under the assumption that the lengths of $W_T$ and $V_T$ are equal, this implies that n+2q-2 = N. So, $W_T$ and $V_T$ must each contain q = (N-n)/2 + 1 terms. [Notice that, when the plant is perfectly conditioned (i.e. G(z) = g(z)I), N equals n and the above analysis implies that q = 1, which agrees with the fact that W(z) = V(z) = I.] If, however, G(z) is poorly conditioned, the perturbation terms in eqn 7.19 will generate an additional Q significant terms. Hence, $W_T$ and $V_T$ must now contain q = (N-n+Q)/2 + 1 terms each. Although it is not possible to obtain a general expression for Q, the number of extra sequence elements generated by these error terms will tend to be small for all but the most poorly conditioned plants. Thus, the truncation levels for ${}^{s}W$ and ${}^{s}V$ can be expected to be significantly smaller than either N or n, and this conclusion agrees with observations for a large number of systems.
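
The term-count estimate above can be illustrated with a small helper. This is only a sketch: the function name is hypothetical, Q must be supplied by the user (the text gives no general expression for it), and rounding up when N-n+Q is odd is an assumption made here, not a rule stated in the analysis.

```python
import math

def direction_truncation_length(N, n, Q=0):
    """Estimate q, the number of terms kept in each of W_T and V_T.

    N : number of terms in the truncated plant series G_T
    n : number of terms in the truncated eigenvalue series Lambda_T
    Q : extra significant terms caused by poor conditioning
        (user-supplied; no general formula is available)
    """
    # q = (N - n + Q)/2 + 1; rounding up for odd N - n + Q is an assumption.
    return math.ceil((N - n + Q) / 2) + 1

q_well_conditioned = direction_truncation_length(20, 10)        # (20-10)/2 + 1
q_perfectly_conditioned = direction_truncation_length(10, 10)   # N = n gives q = 1
q_poorly_conditioned = direction_truncation_length(20, 10, Q=4)
```

With N = 20 and n = 10 the estimate is q = 6, and the perfectly conditioned case N = n reproduces q = 1 as noted in the text.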

7.2.4  Extensions  for  Special  Situations 

The results established in Sections 7.2.2 and 7.2.3 suggest that, for the generic case when $\Delta(z,g)$ is irreducible, the CWS, CVS and dual CVS can be used to generate accurate finite-series representations for $g_i(z)$, $w_i(z)$, and $v_i^t(z)$ provided the open-loop system is stable and g(z) has no unstable branch points. In fact, these results may be extended to unstable systems by pulling out the unstable poles of G(z) as common factors to produce:

$$G(z) = \frac{1}{d_u(z)}\, S(z) \tag{7.20}$$

where S(z) is a stable transfer function matrix and $d_u(z)$ is a scalar function containing the unstable poles of G(z). The convergence conditions of Theorems 7.1 and 7.3 can now be applied to S(z) and, since this transfer function matrix is stable, its CWS and CVS will decay provided S(z) has no unstable branch points. Hence, the eigenfunctions and characteristic directions of S(z) can be accurately described by finite-dimensional z-domain representations as discussed previously. But from eqn 7.20, the characteristic directions of S(z) and G(z) are the same. Furthermore, the respective eigenfunctions, $s_i(z)$ and $g_i(z)$, are related by:

$$g_i(z) = \frac{s_i(z)}{d_u(z)}$$

As a result, accurate finite-dimensional z-domain representations for the eigenfunctions and characteristic directions of unstable systems can also be generated provided no unstable branch points exist.

The results of the previous sections can also be extended to the much more unusual case when $\Delta(z,g)$ is reducible to linear factors. In this situation, there are no branch points. Hence, the CWS must necessarily decay. At first glance, this observation also seems to imply that the CVS and dual CVS will decay. However, unlike the previous situation where repeated eigenvalues associated with a nonsimple Jordan form could only occur at branch points, it is now possible for a nonsimple Jordan form to exist at any point $z_0$ where $G(z_0)$ has repeated eigenvalues. If this occurs, W(z) will lose rank as $z \to z_0$ and, in the limit, $V(z) = W^{-1}(z)$ will be unbounded. For $|z_0| > 1$, this suggests that V(z) cannot be written as a power series in $z^{-1}$ with decaying coefficients. Hence, the resulting CSM representation of the system will be unsuitable for control design purposes.

For this problem however, the CSM algorithm may be modified to produce decaying sequences. Preliminary attempts by Li and Cameron [LI1] have produced modifications for the extremely unlikely situation when $\Delta(z,g)$ has repeated linear factors associated with a nonsimple Jordan form at all points in the complex plane, a situation which implies that G(z) is identically similar to J(z) (i.e. $G(z) \equiv W(z)\, J(z)\, V(z)$ for all z, where J(z) denotes the appropriate Jordan form). But they have not addressed the less unlikely situation where $G(z) = W(z)\, J(z)\, V(z)$ at isolated values of z. Modifications to handle this situation are developed in Appendix 7.2.

Again, it must be stressed that $\Delta(z,g)$ will, generically, be irreducible. Hence, the developments for the case when $\Delta(z,g)$ is reducible to linear factors are presented primarily for completeness. More importantly, the fact that $\Delta(z,g)$ is irreducible suggests that, for all practical control problems, the convergence conditions of Theorems 7.1 and 7.3 must be satisfied to obtain suitable finite-dimensional characteristic subsystem descriptions. Unstable branch points must, therefore, be avoided to permit the use of these subsystem descriptions in control designs. The problems of recognizing and eliminating unstable branch points are examined in greater detail in the following section.

7.3  An  Investigation  of  Unstable  Branch  Points 

7.3.1  Recognizing  Unstable  Branch  Points 

Closer examination of the CSM algorithm indicates that the initial element of the plant weighting sequence, G(0), is a key component in the calculation of the CWS and CVS. Indeed, once the eigenvalues and eigenvectors of G(0) are identified, the remaining sequence elements can be readily computed using simple algebraic operations. Thus, the eigenvalues and eigenvectors of G(0) have a particularly significant effect on all elements of the CWS and CVS, and one should therefore anticipate that these quantities can be directly related to the convergence characteristics of the CWS and CVS. In fact, several conditions based on the eigenvalues of G(0) can be established to identify the existence of unstable branch points as shown below.

It is known [SMI1] that the branch points of g(z) are a subset of the zeros of the discriminant polynomial, $D_g(z)$, associated with $\Delta(z,g) = 0$. [Note: For a 2 x 2 system, the characteristic equation is

$$\Delta(z,g) = a(z)\, g^2 + b(z)\, g + c(z) = 0$$

and $D_g(z) = b^2(z) - 4\, a(z)\, c(z)$. A precise definition of $D_g$ for the general case can be found, for example, in [BLI1].] A detailed investigation of $D_g$ should, therefore, provide insight into the specified problem. To begin, G(z) can be written in power series form as shown in eqn 7.4 and the coefficients G(i) will decay to zero. [Note: For all of the following developments, G(z) will be assumed to be stable. If the open-loop system is unstable, the following results will hold for S(z) (defined by eqn 7.20) and, hence, they can also be extended to unstable systems.] Using this power series representation, $\Delta(z,g)$ can be rewritten as:

$$\Delta(z,g) = \left|\, g(z)\, I - \sum_{i=0}^{\infty} G(i)\, z^{-i} \right| = 0 \tag{7.21}$$

and so, the largest power of z in the discriminant polynomial, $D_g(z)$, must necessarily be $z^0$. Furthermore, the coefficient of the $z^0$-term can be identified simply by setting $z^{-1} = 0$. When this is done, eqn 7.21 reduces to the characteristic equation defining the eigenvalues of G(0) (i.e. |gI - G(0)| = 0). Hence, the coefficient of the $z^0$-term in $D_g$ is equal to the discriminant, $D_0$, of the polynomial defined by |gI - G(0)| = 0. With this in mind, the following result can be developed:

Lemma 7.4: Let $\Delta(z,g)$ be irreducible and of degree m in g(z). Assume G(0) has distinct eigenvalues, and let $p_0$, $p_1$, and $p_{-1}$ denote the number of pairs of complex conjugate eigenvalues of G(0), G(1), and G(-1) respectively. Then if $p_0$ is odd (even) and $p_1$ is even (odd), there is at least one zero of $D_g$ in $C_1$. Similarly if $p_0$ is odd (even) and $p_{-1}$ is even (odd), there is at least one zero of $D_g$ in $C_1$.

Proof: Since $D_0$ is the coefficient of $z^0$ in $D_g$, $D_g$ can be written as:

$$D_g(z) = z^{-k} \left\{ D_0\, z^k + \sum_{j=1}^{k} \gamma_j\, z^{k-j} \right\} \tag{7.22}$$

where k is always even and $\gamma_j$ are appropriately-determined coefficients. Furthermore, $D_0 \neq 0$ since G(0) has distinct eigenvalues. So the existence of zeros of $D_g$ in $C_1$ will be a function of the sign of $D_0$.

Consider first the case where $p_0$ is even. Since $D_0$ is the discriminant of the polynomial defined by |gI - G(0)| = 0, $D_0$ is positive if $p_0$ is even [BAR1]. Hence, by Jury's stability criterion [SAU1], $D_g$ will have a zero in $C_1$ if $D_g(1)$ is negative. But $D_g(1)$ is negative if, and only if, $p_1$ is odd [BAR1]. Alternatively when $p_0$ is odd, $D_0$ is negative and Jury's criterion implies that $D_g$ has a zero in $C_1$ if $-D_g(1)$ is negative (i.e. $D_g(1) > 0$). But $D_g(1)$ is positive if, and only if, $p_1$ is even.

Since k (in eqn 7.22) is always even, similar arguments can be used to construct the stated conditions for G(-1).

....QED

The conditions presented in Lemma 7.4 can be used to identify the existence of repeated values of g(z) for some z in $C_1$. Although branch points of g are (by Lemma 7.2 and Theorem 7.2) only a subset of the repeated values of g, for all practical situations the two will be the same. As such, Lemma 7.4 highlights conditions which guarantee the existence of unstable branch points. [Note: Although this observation appears to diminish the importance of Lemma 7.2 and Theorem 7.2, it will be shown in the next section that these two earlier results can be used to predict situations in which the existence of unstable branch points is irrelevant.] Furthermore, the conditions of Lemma 7.4 can be refined as follows:

Theorem 7.4: Let $\Delta(z,g)$ be irreducible and of degree m in g(z), and let $p_0$, $p_1$, and $p_{-1}$ denote the number of pairs of complex conjugate eigenvalues of G(0), G(1), and G(-1) respectively. Furthermore, assume that the real eigenvalues of G(1) and G(-1) are all distinct. Then, if $p_0 \neq p_1$ or $p_0 \neq p_{-1}$, unstable branch points must exist.

Proof: For the moment, assume that g has no branch points in $C_1$. Then by Theorem 7.1, the elements of each CWS must decay, and these elements can be related to $g_i(z)$ by eqn 7.6a. Since $g_i(z)$ is a branch of the characteristic gain function, $g_i(1)$ is an eigenvalue of G(1). As a result, each eigenvalue of G(1) can be rewritten in the following way:

$$g_i(1) = \sum_{j=0}^{\infty} {}^{s}g_i(j) \tag{7.23}$$

First, consider the case $p_0 < p_1$. This condition implies that there are more complex eigenvalues of G(1) than there are complex eigenvalues of G(0). But a review of the standard CSM algorithm in Appendix 7.1 suggests that the number of complex eigenvalues of G(0) is precisely the same as the number of complex CWS. As a result, the relationship imposed by the condition $p_0 < p_1$ suggests that a complex number (one of the eigenvalues of G(1)) is identically equal to an infinite sum of real numbers. Clearly this cannot be the case. So, the initial assumption that g(z) has no unstable branch points cannot be valid.

Next consider the case $p_0 > p_1$ (i.e. more complex CWS than complex eigenvalues of G(1)). In this case, eqn 7.23 can only be satisfied if $\sum_{j=0}^{\infty} \mathrm{Imag}\{{}^{s}g_i(j)\} = 0$ for some i. But the standard CSM algorithm also suggests that complex CWS can only exist in complex conjugate pairs. So, $\sum_{j=0}^{\infty} \mathrm{Imag}\{{}^{s}g_i(j)\} = 0$ necessarily implies that G(1) will have repeated real eigenvalues. But this result violates the assumption that the eigenvalues of G(1) are distinct. Thus the initial assumption that g(z) has no unstable branch points cannot be valid. Hence, unstable branch points must exist when $p_0 \neq p_1$.

Similar arguments will produce a corresponding result for z = -1 since

$$g_i(-1) = \sum_{j=0}^{\infty} {}^{s}g_i(j)\, (-1)^j$$

....QED
It must be pointed out that the conditions established by Theorem 7.4 are only sufficient. If $p_0 = p_1 = p_{-1}$, unstable branch points may still exist. However if either of the conditions, $p_0 \neq p_1$ or $p_0 \neq p_{-1}$, is satisfied, unstable branch points must exist. As a result, a quick assessment of potential convergence problems can be made prior to implementing the CSM algorithm.
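
The quick assessment suggested by Theorem 7.4 can be sketched numerically. The sketch below assumes the plant is available as a finite weighting sequence stored as an array of shape (N, m, m), so that G(0) is the first element and G(1), G(-1) are the transfer function matrix evaluated at z = 1 and z = -1. The function names are hypothetical, and the hypotheses of the theorem (irreducibility of the characteristic equation, distinct real eigenvalues of G(1) and G(-1)) are not checked.

```python
import numpy as np

def count_complex_pairs(M, tol=1e-9):
    """Number of complex-conjugate eigenvalue pairs of a real square matrix."""
    eigenvalues = np.linalg.eigvals(M)
    # Each conjugate pair contributes exactly one eigenvalue with Im > 0.
    return int(np.sum(eigenvalues.imag > tol))

def branch_point_precheck(G_seq):
    """Compare p_0, p_1 and p_{-1} for a finite weighting sequence.

    Returns (flagged, (p0, p1, pm1)); flagged is True when the sufficient
    condition p0 != p1 or p0 != pm1 holds.
    """
    G_seq = np.asarray(G_seq, dtype=float)          # shape (N, m, m)
    signs = (-1.0) ** np.arange(len(G_seq))
    G_zero = G_seq[0]                               # G(z) at z^{-1} = 0
    G_one = G_seq.sum(axis=0)                       # G(z) at z = 1
    G_minus_one = np.tensordot(signs, G_seq, axes=1)  # G(z) at z = -1
    p0 = count_complex_pairs(G_zero)
    p1 = count_complex_pairs(G_one)
    pm1 = count_complex_pairs(G_minus_one)
    return (p0 != p1) or (p0 != pm1), (p0, p1, pm1)

# Illustrative two-term sequence (made-up numbers, not from the text).
G_seq = [[[1.0, 0.0], [0.0, 2.0]],
         [[0.0, -3.0], [3.0, 0.0]]]
flagged, counts = branch_point_precheck(G_seq)
```

Because the condition is only sufficient, a result of `flagged == False` says nothing about the absence of unstable branch points.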

One further condition for the existence of unstable branch points may be established by recognizing that the eigenvalues of G(0) are the eigenvalues of G(z) evaluated at $z^{-1} = 0$. This leads to the following result:

Theorem 7.5: For the case when $\Delta(z,g)$ is irreducible, $z^{-1} = 0$ will be a branch point of g(z) if and only if G(0) has repeated eigenvalues associated with a nonsimple Jordan form.

Proof: Since the eigenvalues of G(0) are also the eigenvalues of G(z) for $z^{-1} = 0$, this result follows immediately from Theorem 7.2.

....QED

7.3.2  Coping  with  Unstable  Branch  Points 

Theorems 7.4 and 7.5 provide a means of recognizing many situations when the sequences generated by the standard CSM algorithm will not decay. For these cases, it may not be possible to obtain accurate, finite-dimensional z-domain representations for $g_i$, $w_i$, and $v_i^t$ via truncation of the CWS and CVS. However, the requirement for stable branch points is not as restrictive as it may first appear. Indeed, in many instances, appropriate finite-sequence representations may still be obtained in a straightforward manner. The procedures required to accomplish this task can be divided into three categories, each of which will be addressed separately in the following discussion.

Case 1: Latent ("Pseudo") Branch Points

As demonstrated by Lemma 7.2 and Theorem 7.2, repeated zeros, $g_0$, of $\Delta(z_0,g)$ for $|z_0| > 1$ may not correspond to unstable branch points. Indeed, if $z_0$ is a repeated zero of $\Delta(z,g_0)$ with the appropriate multiplicity, then $z_0$ is not a branch point. So, the convergence characteristics of the CWS and CVS will not be affected by its presence. This key characteristic also suggests that, under certain conditions, accurate sequence representations can still be obtained directly from the CSM algorithm despite the presence of unstable branch points.

In particular, the required conditions correspond to the situation when [...] (defined for constant g), $D_z(g)$, with g set equal to $g_0$. If $D_z(g_0)$ is sufficiently small, there must exist a point $z_1$ near $z_0$ which is also a zero of $\Delta(z,g)$. For this situation, there will be significant delays in the divergent effects of $z_0$ and $z_1$ on the elements of the CWS and CVS. As a result, the CSM algorithm will produce sequences which decay initially and which can be truncated to establish the desired finite-sequence representations. Of course, an upper limit on the length of the truncation is now dependent on the point at which the effects of the unstable branch points become apparent. However, for $z_0$ and $z_1$ sufficiently close, extremely accurate representations will already have been obtained. Indeed, the example in Section 7.5.2 demonstrates the degree of accuracy which can be achieved in these situations.

Case  2:  Discrete  Fourier  Transform  (DFT)  approximations 

When the unstable branch points of g(z) are isolated however, their effects on the CWS and CVS will be immediate and significant. In these situations, the CSM algorithm cannot be used to obtain the desired sequence representations, but useful finite sequences can be obtained using a frequency-response-matching (or Discrete Fourier Transform, DFT) approach. For control design purposes, the desired sequences, ${}^{s}g_i$, ${}^{s}w_i$, ${}^{s}v_i^t$, must accurately describe the frequency response behaviour of the eigenvalues and eigenvectors of G(z). Viewed from this perspective, a realization of the appropriate sequence representations can be achieved using an inverse DFT approach as discussed in many digital signal processing texts (e.g. [BEL1]). A key element in the procedure proposed here is the use of the inverse DFT on the frequency-dependent eigenvalues and eigenvectors of G(z). An algorithm for calculating ${}^{s}g_i$ based on the eigenvalues of G(z) becomes:

Procedure 7.1:

(1) Select the number of sequence elements, N, appropriately large.

(2) Identify the eigenvalues of G(z), $g_1(z), \ldots, g_m(z)$, at the N values of z given by $z_i = \exp\{j(2\pi i/N)\}$, $i = 0, \ldots, N-1$.

(3) Form the matrix F and vectors $\{g_i(z),\ i = 1, \ldots, m\}$ as shown here:


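$$F = \begin{bmatrix} z_0^{-1} & \cdots & z_0^{-N} \\ z_1^{-1} & \cdots & z_1^{-N} \\ \vdots & & \vdots \\ z_{N-1}^{-1} & \cdots & z_{N-1}^{-N} \end{bmatrix} \qquad g_i(z) = \begin{bmatrix} g_i(z_0) \\ g_i(z_1) \\ \vdots \\ g_i(z_{N-1}) \end{bmatrix}$$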

(4) Calculate the vectors $\{g_i,\ i = 1, \ldots, m\}$ containing the elements of ${}^{s}g_i$ using the relationship:

$$g_i = \frac{1}{N}\, F^{*}\, g_i(z) \tag{7.24}$$

where $F^{*}$ is the complex conjugate transpose of F.

(5) Truncate the elements of each vector $g_i$ appropriately to obtain the desired sequence representations.
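
Steps (1) to (4) of Procedure 7.1 can be sketched with NumPy as follows. Several choices here are assumptions rather than part of the procedure: the plant is supplied as a hypothetical callable `G_of_z`, the columns of F are indexed k = 0, ..., N-1 so that element k of the result multiplies $z^{-k}$, and eigenvalue branches are followed from one frequency to the next by simple nearest-neighbour matching, which can fail if two branches pass close to each other. Truncation (step 5) is left to the user.

```python
import numpy as np

def eigenvalue_sequences(G_of_z, N):
    """Inverse-DFT sequences for the eigenvalues of G(z) on the unit circle.

    G_of_z : callable returning the m x m matrix G(z) at a complex point z
             (hypothetical interface).
    N      : number of sequence elements and frequency points.
    Returns an (N, m) array; column i holds the sequence for branch i.
    """
    k = np.arange(N)
    z = np.exp(2j * np.pi * k / N)          # step (2): points z_i
    F = z[:, None] ** (-k[None, :])         # step (3): F[i, k] = z_i^{-k}

    # Eigenvalues come back in arbitrary order, so each branch is tracked
    # by matching to the nearest eigenvalue at the previous frequency.
    prev = np.sort_complex(np.linalg.eigvals(G_of_z(z[0])))
    branches = [prev]
    for zi in z[1:]:
        cur = list(np.linalg.eigvals(G_of_z(zi)))
        prev = np.array([cur.pop(int(np.argmin([abs(c - p) for c in cur])))
                         for p in prev])
        branches.append(prev)
    g_of_z = np.array(branches)             # shape (N, m)

    return F.conj().T @ g_of_z / N          # step (4): eqn 7.24

# Mechanical check on a plant whose eigenvalues are known finite series:
# g_1(z) = 1 + 0.5 z^{-1} and g_2(z) = 2 + 0.25 z^{-1} (made-up example).
T = np.array([[1.0, 1.0], [0.0, 1.0]])

def G_example(z):
    return T @ np.diag([1 + 0.5 / z, 2 + 0.25 / z]) @ np.linalg.inv(T)

seq = eigenvalue_sequences(G_example, N=16)
```

The example has no branch points at all, so it only exercises the arithmetic; it says nothing about the decay behaviour for plants with isolated unstable branch points.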

Physical arguments associated with the frequency response characteristics of the quantities involved in Procedure 7.1 suggest that the sequences, ${}^{s}g_i$, calculated using this procedure can be truncated. In particular, when the vector $g_i$ is calculated as shown in eqn 7.24, the phase characteristics of $g_i(z)$ and the structural properties of the matrix F will combine to produce a sequence of elements which will, in general, decay. Indeed, observations of several examples indicate that, for systems with widely-separated unstable branch points, the sequences calculated using Procedure 7.1 do decay except for a small increasing "tail" over the last few elements. Hence for N sufficiently large, accurate truncations can be identified. As a result, although the elements of the true CWS cannot be truncated, the truncated sequences obtained from Procedure 7.1 will generally produce finite-series representations which accurately describe $g_i(z)$.

The same technique can also be applied to obtain sequence representations for the characteristic and dual characteristic directions of G(z). It should be noted, though, that the eigenvectors, $w_i(z)$, of G(z) are only unique to a scalar function of z; in other words, if $w_i(z)$ is an eigenvector of G(z), then so is $f(z)\, w_i(z)$, where f(z) is an arbitrary scalar function. Hence when the frequency-dependent eigenvectors of G(z) are evaluated, care must be taken to ensure that the same scaling is used at each value of z. (In practice, this simply requires normalizing the same element of each eigenvector to unity over all frequencies.) Once this is done, the frequency-dependent dual eigenvectors can be calculated using the relationship $V(z) = W^{-1}(z)$, and Procedure 7.1 can be implemented to identify appropriate sequence representations for $w_i(z)$ and $v_i^t(z)$. Indeed, an example is presented in Section 7.5.3 to demonstrate the accuracy of the representations that can be achieved using this approach.
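
The scaling convention described above can be sketched for a single frequency point. The function name and the choice of the first element as the reference are illustrative assumptions; the sketch also assumes the reference element of every eigenvector is nonzero, and it does not address the separate problem of keeping the eigenvalue ordering consistent from one frequency to the next.

```python
import numpy as np

def normalized_eigenframe(G_at_z, ref_row=0):
    """Eigenvalues, scaled eigenvectors W and dual eigenvectors V = W^{-1}.

    Each eigenvector (column of W) is scaled so that its element in row
    `ref_row` equals one, which fixes the arbitrary scalar factor.
    """
    eigenvalues, W = np.linalg.eig(G_at_z)
    W = W / W[ref_row, :]          # divide column j by its reference element
    V = np.linalg.inv(W)           # rows of V are the dual eigenvectors
    return eigenvalues, W, V

# Made-up constant matrix standing in for G(z) at one frequency.
G_sample = np.array([[2.0, 1.0], [1.0, 3.0]])
lam, W, V = normalized_eigenframe(G_sample)
```

Applying the same normalization at every $z_i$ yields frequency-dependent eigenvector elements that can be passed through the inverse DFT of Procedure 7.1 one element at a time.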

Case  3:  Constant  Precompensation 

The  inverse  DFT  method  outlined  above  will  generally  overcome  the 
problem  of  unstable  branch  points  to  produce  accurate  finite  sequence 
representations  for  the  characteristic  gains  and  directions.  Of  course, 
this  solution  relies  on  the  time-consuming  calculation  of  the  eigenvalues 
and  eigenvectors  of  G(z)  at  a  large  number  of  points  on  the  unit  circle  and, 
hence,  leads  to  a  considerable  increase  in  computational  complexity  when 
compared  to  the  CSM  algorithm.  For  this  reason,  the  approach  is  much  more 
appropriate  for  use  in  off-line  design  methodologies.  In  many  situations 
however,  the  problem  of  unstable  branch  points  can  be  solved  without 
resorting  to  the  inverse  DFT  method. 

Consider  for  a  moment,  the  multivariable  control  design  problem  from  a 
practical  perspective.  An  obvious  goal  of  the  design  process  is  to  produce 
a  closed-loop  system  with  acceptable  transient  and  steady-state 
characteristics  and  low  interaction.  Hence,  adequate  compensation  will,  in 
general, attempt to produce a closed-loop system that is stable, near-diagonal, and minimum phase. When this is achieved, the z-domain characteristics of the eigenfunctions and diagonal elements of R(z) will tend to become the same, the only deviations arising from the small perturbations introduced by the off-diagonal elements of R(z). Consequently, the closed-loop system will either have no unstable branch points or, at worst, unstable branch points (introduced by the off-diagonal perturbations) which are very close together. Since branch points of the open-loop and corresponding closed-loop system are identical, this suggests that adequate multivariable compensation will generally reposition unstable branch points to locations inside the unit circle (or, at worst, produce latent unstable branch points).

Based  on  these  considerations,  it  seems  reasonable  to  suggest  that  some 
simple  form  of  precompensation  can  be  used  to  reposition  unstable  branch 
points  prior  to  implementing  the  CSM  algorithm  while,  at  the  same  time, 
taking  a  step  toward  achieving  the  closed-loop  design  objectives.  Indeed, 
as  demonstrated  by  example  in  Section  7.5.1,  implementing  constant  alignment 
compensation  to  reduce  high  frequency  interaction  will,  in  many  cases, 
accomplish  this  additional  task.  More  sophisticated  "branch-point 
placement"  algorithms  are  also  possible,  although  these  have  yet  to  be 
investigated  in  detail.  Once  appropriate  precompensation  has  been 
introduced  to  eliminate  unstable  branch  points,  the  CSM  algorithm  will 
produce  CWS  and  CVS  that  can  be  used  in  multivariable  control  designs  to 
meet  desired  stability  margins  and  other  performance  objectives  based  on 
classical  SISO  design  techniques.  An  appropriate  methodology  for 
implementing  these  compensators  is  discussed  in  the  following  section. 

7.4  "Exactly"  Commutative  Control  via  Convolution 

7.4.1  General  Comments  on  Commutative  Controllers 

For discrete-time multivariable systems, the control design problem becomes one of selecting a (possibly dynamic) precompensator, K(z), so that the closed-loop system described by the transfer function matrix

$$R(z) = \{I + G(z)\, K(z)\}^{-1}\, G(z)\, K(z) = \{I + Q(z)\}^{-1}\, Q(z) \tag{7.25}$$

behaves in an appropriate manner. Returning to the frequency-domain approach implied by the characteristic locus method and discussed in Chapter 2, it can be seen that the properties of the closed-loop system above are directly related to the frequency-dependent eigenvalues and eigenvectors of the open-loop transfer function matrix, Q(z) = G(z)K(z). Thus, desired closed-loop response can be obtained by choosing K(z) to manipulate these quantities appropriately. In essence, the CLM reduces the multivariable design problem to a set of SISO problems without resorting to ad hoc attempts to diagonalize the plant. So, the ultimate design can, in fact, be based on classical frequency response techniques.

Unfortunately, the selection of K(z) is a rather challenging task because it is not generally possible to relate the eigenvalues and eigenvectors of Q to the individual eigenvalues and eigenvectors of G and K in a systematic manner. If, however, G and K share a common set of eigenvectors, the eigenvalues of Q are the products of the individual (corresponding) eigenvalues of G and K and direct compensation of each eigenvalue is possible. When K(z) is designed in this manner, G and K commute (i.e. G(z) K(z) = K(z) G(z)); so K(z) is often referred to as a commutative controller. But as implied by algebraic function theory, the functions $g_i(z)$, $w_i(z)$, and $v_i^t(z)$ associated with G(z) are generally irrational functions of z and so it is not possible to construct a realizable controller K(z) that commutes exactly with G(z) for all values of z. As a result, appropriate approximations for W(z) and V(z) must be generated to accomplish this design task.
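
The commutative property can be checked numerically at a single frequency. The matrices below are made-up constants standing in for G(z) and K(z) evaluated at one value of z; the only point of the sketch is that a shared eigenvector matrix makes the two matrices commute and makes the eigenvalues of Q = GK the products of the corresponding eigenvalues.

```python
import numpy as np

# Shared eigenvector matrix W and its inverse V (illustrative values).
W = np.array([[1.0, 1.0], [1.0, -2.0]])
V = np.linalg.inv(W)

g = np.array([2.0, 0.5])      # eigenvalues of G at this frequency
k = np.array([3.0, 4.0])      # eigenvalues chosen for the controller K

G = W @ np.diag(g) @ V
K = W @ np.diag(k) @ V
Q = G @ K

commutes = np.allclose(G @ K, K @ G)
q_eigenvalues = np.sort(np.linalg.eigvals(Q).real)   # compare with g * k
```

Here `q_eigenvalues` matches the sorted products `g * k`, so each characteristic gain of the plant is compensated independently by the corresponding eigenvalue of K.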

In the past, the use of rational, s-domain (or z-domain) approximations for W and V has been rejected on the grounds that the increased complexity of the resulting controller produces unacceptable designs. So, past efforts have concentrated on producing constant approximations for W and V which are correct over limited frequency ranges |Kotll|, |K.I)M1|, |K<>IH|. Clearly, the effectiveness of the resulting control designs depends on the accuracy of the approximation; but more importantly, the results obtained will only be valid over a narrow range of frequencies. As a result, the SISO design methodology associated with the CLM becomes a more complex task requiring a combination of several frequency-dependent compensators. In addition, the specified modifications to the individual characteristic loci (required to achieve desired closed-loop characteristics) will only be accurately reproduced at frequencies where the constant approximation is accurate; a particularly significant restriction for systems with rapidly-changing frequency-dependent eigenvectors. As such, the use of constant approximations for W and V may significantly reduce the effectiveness of the resulting control design.

Until recently, concerns about design complexity have justified these constant approximations because the use of serial algorithms for compensator implementations translates complexity directly into excessive computation time. However, with the rapid advances being made in Very Large Scale Integration (VLSI) technology and parallel processing algorithms (e.g. [KUN1] or [KUN2]), so-called increased complexity may no longer be a valid reason to resort to such simple approximations. Indeed, by discarding the standard perspective of implementing difference equation controllers and instead focusing on "convolution controller" implementations using the finite-series representations developed previously, the commutative controller problem can be reformulated as one that relies solely on parallel operations involving large-dimensioned matrices. Existing computer capabilities can then be used (at no extra computational expense) to produce nearly exact replicas of W and V over all frequencies. Under these circumstances, the control engineer will no longer be bound to eigenframe approximations at a single frequency or a limited set of frequencies and, hence, modifications to the individual subsystems can be embedded in an "exactly" commutative controller design to establish a complete multivariable design methodology based solely on classical SISO frequency response methods. Two on-line algorithms for implementing this controller are described below.


7.4.2  A  Three-Stage  Convolution  Algorithm 

Consider the standard closed-loop system shown in Figure 7.1. A three-term representation for a controller based on the CSM results described previously will consist of the following components: $W_T(z)$, $V_T(z)$, and $\Lambda_K(z)$, where $W_T(z)$ and $V_T(z)$ are the truncated series representations of W and V (consisting of $n_w$ and $n_v$ terms respectively) and $\Lambda_K(z)$ is a diagonal (difference equation) compensator designed to modify the individual characteristic loci based on classical frequency response considerations. For this situation, the compensated closed-loop system takes the form shown in Figure 7.2, and by examining the signals e, $e_1$, $e_2$, and u in more detail, an appropriate implementation algorithm can be established for this controller.

First, consider the transfer function $V_T(z)$. By definition, $e_1(z) = V_T(z)\, e(z)$, and the inverse z-transform of this relationship yields the convolution relationship ${}^{s}e_1 = {}^{s}V_T * {}^{s}e$. Hence, on-line calculation of the sequence ${}^{s}e_1$ simply requires:

(i) the (possibly parallel) implementation of a series of matrix-vector multiplications given by:

$$e_1(k) = \sum_{j=0}^{n_v-1} V_T(j)\, e(k-j) \tag{7.26}$$

(ii) storage of the $n_v$ elements of ${}^{s}V_T$ and the last $n_v$ elements of ${}^{s}e$.

Similar relationships hold for $W_T(z)$, yielding the following relationship between the controller output, u, and $e_2$:

$$u(k) = \sum_{j=0}^{n_w-1} W_T(j)\, e_2(k-j) \tag{7.27}$$
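
The convolution sums of eqns 7.26 and 7.27 share the same structure, and a sample-by-sample implementation can be sketched as below. The class name and interface are hypothetical; the stored history is assumed to start at zero, and the summation is written as a single `einsum` call to reflect that all of the matrix-vector products at one sample are independent and could be evaluated in parallel.

```python
import numpy as np
from collections import deque

class ConvolutionStage:
    """Streaming evaluation of y(k) = sum_j M(j) x(k - j) for a finite
    matrix sequence M(0), ..., M(n_terms - 1)."""

    def __init__(self, matrix_sequence):
        self.seq = np.asarray(matrix_sequence, dtype=float)  # (n_terms, m, m)
        n_terms, m, _ = self.seq.shape
        # Storage for the last n_terms inputs, newest first (zero initial state).
        self.history = deque([np.zeros(m)] * n_terms, maxlen=n_terms)

    def step(self, x):
        self.history.appendleft(np.asarray(x, dtype=float))
        past = np.stack(list(self.history))       # past[j] = x(k - j)
        # Sum over the sequence index j and the input index b.
        return np.einsum('jab,jb->a', self.seq, past)

# Made-up two-term sequence standing in for V_T(0), V_T(1).
stage = ConvolutionStage([[[1.0, 2.0], [0.0, 1.0]],
                          [[0.0, 1.0], [1.0, 0.0]]])
outputs = [stage.step(x) for x in ([1.0, 0.0], [0.0, 1.0], [0.0, 0.0])]
```

One instance configured with the $V_T$ sequence and another configured with the $W_T$ sequence would provide the first and last stages of the controller, with the diagonal compensator acting between them.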


Figure  7.1:  Standard  Closed-Loop  Configuration 
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Finally,  the  implementation  of  the  diagonal  controller  can  be  a< c omp 1 i shed 
via  standard  S1S0  difference  equations.  Mote  specifically,  if  the 
diagonal  element  of  <  z )  is  given  by: 

1 

I  n  <])  z  J 

J'V 

L  d  <  J »  z  > 

j-0 

with  d^(0)  «  1,  then  the  time  domain  relationship  between  the  i**1  element  of 
e^  and  the  ith  element  of  e^  is: 

pi  1  «t  1 

«,  <k)  *  I  n  (j)  e  (k  ))  t  d.(j)  e,  (k  J )  ...  <7  78) 

i  j-0  1  l  j.l  1  M 

The control algorithm described above can clearly be performed in three
distinct stages, the first and last involving $n_v$ and $n_w$ matrix-vector
multiplications respectively and the second involving only scalar
multiplications. In addition, the calculations required at each stage can
be performed using appropriate convolutions. Thus, parallel processing can
be applied not only to each individual matrix-vector multiplication but also
to the entire convolution summation. Using VLSI technology, the resulting
algorithm should, therefore, yield a computationally-efficient parallel
implementation. Furthermore, when $n_v$ and $n_w$ are small (as has been observed
for a large number of systems), serial implementations will also yield
adequate results. These observations suggest that the algorithm highlighted
above can be applied on-line to a wide variety of multivariable systems.
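
The three stages can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation used in the thesis: the function names, the array layout (truncated sequences stored as arrays of shape (n, m, m), one numerator and one denominator coefficient list per loop with leading denominator coefficient 1), and the off-line processing of a whole error record are all choices made here for illustration.

```python
import numpy as np

def matrix_convolve(seq, x):
    """Return y(k) = sum_j seq[j] @ x[k-j], taking x(k) = 0 for k < 0.

    seq has shape (n_terms, m, m); x has shape (n_samples, m).
    """
    n_samples = x.shape[0]
    y = np.zeros((n_samples, seq.shape[1]))
    for k in range(n_samples):
        for j in range(min(len(seq), k + 1)):
            y[k] += seq[j] @ x[k - j]
    return y

def diagonal_stage(num, den, e1):
    """Apply one SISO difference equation per loop (den[i][0] is assumed to be 1)."""
    n_samples, m = e1.shape
    e2 = np.zeros_like(e1)
    for i in range(m):
        n_i, d_i = num[i], den[i]
        for k in range(n_samples):
            # Moving-average part acting on the stage-one output.
            acc = sum(n_i[j] * e1[k - j, i] for j in range(len(n_i)) if k - j >= 0)
            # Autoregressive part acting on past stage-two outputs.
            acc -= sum(d_i[j] * e2[k - j, i] for j in range(1, len(d_i)) if k - j >= 0)
            e2[k, i] = acc
    return e2

def three_stage_control(V_T, num, den, W_T, e):
    """Stage 1: e1 = V_T * e; stage 2: diagonal controller; stage 3: u = W_T * e2."""
    e1 = matrix_convolve(V_T, e)
    e2 = diagonal_stage(num, den, e1)
    return matrix_convolve(W_T, e2)
```

Each term of the two matrix convolutions is independent of the others, which is what makes stages one and three candidates for parallel evaluation; the recursion in stage two is the only part that must run sample by sample.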


Figure 7.2: Closed-Loop Configuration for Three-Stage Convolution Control

7.4.3 A Single-Stage Convolution Algorithm

Although many of the calculations in the three-stage algorithm can be
performed simultaneously, the existence of three separate stages suggests
that some amount of serial computation is still required, and this leads to
an increase in the overall computation time associated with each controller
output. If the diagonal controller $\Lambda_K(z)$ is viewed from a different
perspective, however, many of the remaining serial calculations can be
eliminated and on-line implementations can be speeded up still further.

To establish this alternative algorithm, consider the following. If
$\Lambda_K(z)$ is stable, it too can be written as an infinite power series with
decaying coefficients. In this case, the complete compensator K can be
expressed in sequence form as:

$$ {}^sK = {}^sW * {}^s\Lambda_K * {}^sV $$

Since all three sequences on the right-hand side decay, the elements of ${}^sK$
must do the same. Thus, K(z) can be accurately approximated by a truncated
power series (with $n_k$ terms), and its output at sample k will be given by:

$$ u(k) = \sum_{j=0}^{n_k-1} K(j)\, e(k-j) $$    ....(7.29)

On-line calculation of the control signal now involves only a single set of
matrix-vector multiplications instead of the three-stage calculation
suggested previously.
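
A minimal Python sketch of the single-stage idea is given below. It assumes (for illustration only) that the truncated eigenvector and dual eigenvector sequences are stored as arrays of shape (n, m, m) and that the weighting sequences of the diagonal controller elements are stored row by row in an (n, m) array; the function names are hypothetical.

```python
import numpy as np

def convolve_matrix_sequences(A, B, n_terms):
    """Return the first n_terms of C(k) = sum_j A(j) @ B(k-j)."""
    C = np.zeros((n_terms, A.shape[1], B.shape[2]))
    for k in range(n_terms):
        for j in range(min(len(A), k + 1)):
            if k - j < len(B):
                C[k] += A[j] @ B[k - j]
    return C

def single_stage_sequence(W_T, lam, V_T, n_k):
    """Form the first n_k terms of the compensator sequence W * Lambda * V.

    lam[k, i] is the k-th weighting-sequence term of the i-th diagonal element.
    """
    Lam = np.array([np.diag(row) for row in lam])
    WL = convolve_matrix_sequences(W_T, Lam, n_k)
    return convolve_matrix_sequences(WL, V_T, n_k)

def control_output(K, e_history):
    """Return u(k) = sum_j K(j) @ e(k-j).

    e_history[j] must hold e(k-j) and contain at least len(K) entries.
    """
    return sum(K[j] @ e_history[j] for j in range(len(K)))
```

The compensator sequence is computed once, off-line; only `control_output` has to run at each sample, which is the single set of matrix-vector multiplications referred to above.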

Of course, a single-stage implementation requires ${}^s\Lambda_K$ to be a decaying
sequence, which implies that $\Lambda_K(z)$ cannot include integral action. If
integral action is desired, a diagonal compensator of the form

$$ K_I(z) = f(z)\, I \quad \text{where} \quad f(z) = \frac{z - z_I}{z - 1} $$    ....(7.30)

can be inserted into the forward loop without changing any of the other
compensator terms, and the simple form of this additional compensator
ensures that any increase in computation time is negligible. It should be
noted that the use of scalar integral control as suggested by eqn 7.30
places absolutely no restrictions on the final design. Indeed, all other
characteristic locus modifications (such as gain balancing) can be achieved
through the appropriate selection of $\Lambda_K$. Furthermore, if integral control
is not required in all loops, $\Lambda_K$ can be adjusted to selectively eliminate
the integral action in the loops which do not require it. Thus, the desired
controller can be implemented as shown in Figure 7.3.

7.4.4  Additional  Truncation  Considerations 

For the implementations suggested above, the truncation levels $n_v$, $n_w$,
and/or $n_k$ become design variables to be selected by the control engineer.
Of course, the primary goal in the selection of $n_v$ and $n_w$ is to achieve
accurate approximations for the eigenvector and dual eigenvector functions
of the open-loop plant. But additional truncation level constraints will be
imposed by the computation times required to generate the compensator output
u (for serial implementations) or the hardware required to store the
sequence parameters (for parallel implementations). As discussed in Section
7.2.3, accurate eigenframe representations can typically be achieved with
truncations that are much smaller than those associated with ${}^s\Lambda$ or ${}^sG$. So
in most cases, these additional constraints will not play a significant role
in the selection or accuracy of the desired compensator.

It should, however, be noted that further reductions in $n_v$ and $n_w$ may
be possible depending on the nature of the resulting closed-loop system. As
is clearly evident in the single-stage algorithm above, $V_T(z)$ and $W_T(z)$ are
primarily required to alter the transient response of the closed-loop
system. As a result, the settling time achieved in the ultimate closed-loop
design may be used as a guide to reducing these truncation levels further.
To see this more clearly, consider the situation where the desired settling
time of the controlled plant is pT. It is easy to show that system response
for T < t < pT to a pulse at time zero is completely specified by the first
p elements of the weighting sequences ${}^sV$, ${}^sW$, ${}^s\Lambda_K$, and ${}^sG$. Since all
transient behaviour dies out by t = pT, the compensator sequence ${}^sK$ need not
have more than p elements and, because ${}^sK = {}^sW * {}^s\Lambda_K * {}^sV$, this condition
implies that the individual sequences themselves need not have more than p
elements. Thus, depending on the final closed-loop design, the computation
time requirements (for serial implementations) or the storage requirements
(for parallel implementations) may be reduced still further, based on these
practical considerations, to guarantee timely on-line results.

Figure 7.3: Closed-Loop Configuration for Single-Stage Convolution Control

7.5  Simulation  Results  and  Discussion 

7.5.1  Design  Results  using  Characteristic  Subsystem  Descriptions 

A two-input/two-output discrete-time system described by the transfer
function  matrix 

.75(z3+.27z2-.39z  -.08)  -,29(z3  .40z2-1.07z  -.16) 

z4  1.33z3+.llz2+.29z  -.03  z4-l . 22z3- .09z2+ . 34z 

G(z)  =  ,9  ....(7.31) 

-.21(z+.07z-1.34z  -.62)  ,57(zV  78z  -.26z  -.23) 

.  z4-1.65z3+.52z2+.26z  -.11  z4~l. 14z3-.30z2+.57z  -.09  . 

was used to demonstrate the design methodology proposed in this chapter. An
examination  of  the  characteristic  loci  and  the  alignment  between  the 
characteristic  directions  and  the  standard  basis  vectors  for  this  system 
indicated  the  existence  of  extremely  poor  stability  margins  and  significant 
interaction  effects,  and  these  conditions  are  validated  by  the  closed  loop 
step  responses  shown  in  Figure  7.4.  These  results  clearly  demonstrate  the 
need  for  compensation  to  achieve  adequate  closed-loop  response. 

The design of an appropriate multivariable compensator for this system
was accomplished in two steps. First, a constant precompensator, $K_h$, was
identified to align the characteristic directions to the standard basis
vectors at high frequencies and, hence, to eliminate high frequency
interaction. Using the ALIGN algorithm [KOU1], $K_h$ was selected so that
$G(z)K_h = I$ at $z = \exp(j\pi/2)$ (i.e. $\omega T = 90°$). The resulting precompensator


is  given  by: 


2 . 10b  1 . 068 

1.18?  ?.bb9 


.  .  <  1?) 


and its implementation produced misalignment angles of less than 12° over
the frequency range $90° < \omega T < 180°$ as compared to misalignment angles of
15° to 45° for the original system. It should also be noted that the
implementation of $K_h$ not only achieved the desired alignment
characteristics, but it also repositioned the two unstable branch points
(z = 1.31 and z = 1.108) of G(z). In fact, the magnitude of each of the 30
branch points of $GK_h$ was less than 0.92. As a result, the CSM algorithm
was implemented for the system $G(z)K_h$ to produce a CVS and dual CVS whose
elements decay. These sequences are displayed in Figs 7.5 through 7.7.

Figure 7.4: Time Response of the Uncompensated Closed-Loop System
(panel a: unit step, input 1)

Figure 7.5: Open-Loop Characteristic Weighting Sequences

Figure 7.6: Open-Loop Characteristic Vector Sequences

Figure 7.7: Open-Loop Dual Characteristic Vector Sequences

With series representations for W(z) and V(z) available, the second
step of the compensator design process focused strictly on modifying the two
individual  characteristic  loci  to  obtain  desired  stability  margins  in  each 
loop.  This  was  accomplished  using  classical  SISO  frequency  response 
methods,  and  the  following  diagonal  compensator  was  identified: 

$$ \Lambda_K(z) = \begin{bmatrix} .2 & 0 \\ 0 & \dfrac{-.236\,(z - .686)}{(z - .364)} \end{bmatrix} $$    ....(7.33)

where the negative gain in loop 2 was required simply to reorient the
characteristic locus associated with this loop. $\Lambda_K(z)$ was then combined
with 5-term series representations for the characteristic and dual
characteristic directions (obtained from the CVS and dual CVS) to establish
the desired commutative controller, $K(z) = W_T(z)\,\Lambda_K(z)\,V_T(z)$. [The
characteristic loci of the system before and after the introduction of K(z)
are presented for comparison in Figure 7.8. A more detailed comparison will
be discussed shortly.] The complete dynamic compensator (defined by K(z))
was then implemented using a three-stage algorithm. In addition, a scalar
integral compensator


Kj(z) 


1.0b  (.'  .01 ) 

(z  1) 


was  added  to  the  design  to  eliminate  steady  state  errors.  The  resulting 
closed-loop step responses are displayed in Figure 7.9 and, as desired,
acceptable  transient  response  has  been  obtained,  steady-state  error  has  been 
eliminated,  and  interaction  is  negligible. 

To highlight the accuracy of the finite series implementation suggested
here, the characteristic loci of the compensated system, $(GK_h)\{W_T \Lambda_K V_T\}$, were
compared to the desired characteristic loci (defined by the diagonal matrix
$\Lambda_{GK_h}\Lambda_K$). The results of this comparison over the entire frequency range
($0 < \omega T < \pi$) are presented in Figure 7.10 as percentage differences between
the actual and desired loci for three different truncation levels. As expected,
the level of truncation selected has a significant effect on the accuracy of
the approximation. However, reasonable truncations (in this case, 5 to 10
terms) do produce extremely accurate results over all frequencies. Figure
7.10 also presents results obtained by implementing $\Lambda_K(z)$ using an
approximately commutative controller (ACC). For this study, the ACC was
designed (using standard techniques [MAC3]) to achieve commutativity at the
gain crossover frequency ($\omega T = 33°$) and produced a controller of the form
$K(z) = W_A\,\Lambda_K(z)\,V_A$, where $W_A$ and $V_A$ are the constant approximations obtained
for W(z) and V(z) at $\omega T = 33°$ and are given by:


$$ W_A = \begin{bmatrix} 0.9879 & 0.1985 \\ 0.1173 & 1.0370 \end{bmatrix} \qquad V_A = \begin{bmatrix} 1.0360 & -0.1983 \\ -0.1172 & 0.9869 \end{bmatrix} $$

This  additional  information  demonstrates  that  each  of  the  three  convolution 
controllers  performed  significantly  better  than  the  ACC  over  all  frequencies 
including  the  frequency  where  the  ACC  was  designed  to  perform  best. 

The conclusions that can be drawn from these simulation results are
clear. Reasonable truncations of the CVS and dual CVS produce extremely
accurate approximations for the characteristic directions of the system.
Indeed, as shown in Figure 7.10, the differences for 5- and 10-term
truncations are negligible. As a result, "exactly" commutative controllers
can be designed using the characteristic subsystem representations described
in this chapter, and modifications to the characteristic loci of the system
(as prescribed by a classical SISO analysis) can be achieved simultaneously
over all frequencies using a single compensator design.

Figure 7.8: Open-Loop Characteristic Loci
    a) Characteristic Loci of $GK_h$
    b) Characteristic Loci of $GK_hK$

7.5.2  A  Latent  Branch  Point  Example 

The  case  of  latent  unstable  branch  points  described  in  Section  7 .  t . 
was also investigated via simulation. As mentioned above, G(z) (given by
eqn 7.31) has branch points at z = 1.31 and z = 1.108. The proximity of
these branch points to one another, however, suggests that their effects on
the CVS and dual CVS will be significantly delayed, and so the CSM algorithm
should produce sequences which decay initially. Indeed, for this example,
the effects of these latent unstable branch points did not become noticeable
until after the tenth terms in ${}^sW$ and ${}^sV$. In fact, an element-by-element
comparison of the sixth terms in ${}^sW$ and ${}^sV$ to the first (and largest) terms
in these sequences is presented in Table 7.1 and clearly demonstrates that
accurate 5-term sequence representations for W and V were available.

A  convolution  controller  was  implemented  using  these  5  term  sequence 
representations  together  with  the  following  diagonal  compensator: 

'  .62  0 

V2)  = 

n  1-93  (z  -  . t>3 ) 

.  (z  f  .0855)  . 

For  purposes  of  comparison,  an  ACC  was  designed  using  the  same  diagonal 
compensator and the following constant precompensators:

$$ W_A = \begin{bmatrix} 0.7499 & 0.6 \\ -0.6358 & 0.8547 \end{bmatrix} \qquad V_A = \begin{bmatrix} 0.8359 & -0.5869 \\ 0.6218 & 0.7334 \end{bmatrix} $$

(where $W_A$ and $V_A$ were selected to approximate W and V at the gain crossover
frequency, $\omega T = 55°$). Figure 7.11 presents the percentage error between the
actual compensated characteristic loci and the desired characteristic loci
for these implementations. Again, the results clearly demonstrate the
effectiveness of convolution implementations to achieve "exactly"
commutative control. Furthermore, they suggest that the CSM algorithm can,
indeed, be implemented to produce useful finite-sequence representations in
the presence of unstable, but latent, branch points.

Element    sW: sixth term / first term    sV: sixth term / first term
  11                 .001                           .00?
  12                 .007                           .001
  21                 .007                           .008
  22                 .001                           .002

Table 7.1: A Comparison of the Elements of ${}^sW$ and ${}^sV$

It should be noted that the accuracy of any given finite-sequence
implementation is strongly influenced by the characteristics of the original
system. As such, it is not generally possible to draw conclusions from a
comparison of results for different systems. Thus, although the convolution
controller designed for this latent branch point example appears to be more
accurate than the controller for the example in Section 7.5.1, one must not
conclude that the introduction of latent unstable branch points is a good
thing to do. Indeed, in the absence of unstable branch points, the CVS and
dual CVS will decay, and this guarantees that more accurate approximations
can be obtained simply by increasing the number of terms in the truncated
sequences, a procedure that cannot be used when latent unstable branch
points exist.

Figure 7.11: Characteristic Locus Comparison
(Latent Branch Point Example)


7.5.3 Design Results using DFT Approximations

A  final  simulation  was  conducted  to  demonstrate  the  use  of  the  finite 

sequences  obtained  using  the  inverse  DFT  approach  described  in  Section 

7.1.7. The system used for this study was obtained by combining G(z) (eqn
7.31) with the constant precompensator

    S = [ n.095   .5681 ]
        [  .183   1.085 ]

(where S was selected solely to generate widely-separated unstable branch
points). The
new transfer function matrix, $G_1 = G S$, was found to have unstable branch
points at z = 10.4 and z = 1.8 and, as expected, the CSM algorithm produced
sequences which diverge immediately.

The  inverse  DFT  algorithm  was,  therefore,  implemented  with  N  =  180  to 
generate  appropriate  sequence  representations  for  the  eigenvectors  and  dual 
eigenvectors of $G_1$. As mentioned previously, appropriate scaling of the
frequency-dependent eigenvectors must be accomplished prior to implementing
the algorithm. So for this case, the diagonal elements of W(z) were
normalized to unity at each frequency, and $V(z) = W^{-1}(z)$ was then computed.
Using the resulting frequency-dependent information, sequence
representations of the form:

$$ {}^sW = \begin{bmatrix} {}^se & {}^sw_{12} \\ {}^sw_{21} & {}^se \end{bmatrix} \qquad {}^sV = \begin{bmatrix} {}^sv_{11} & {}^sv_{12} \\ {}^sv_{21} & {}^sv_{22} \end{bmatrix} $$

were produced (as suggested by Procedure 7.1) where ${}^se = \{1, 0, \cdots\}$ and
the first six terms of the remaining sequences are listed in Table 7.2.
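
The basic operation behind an inverse DFT approach can be illustrated with a short Python sketch. It is not the procedure of the thesis (which is defined in an earlier section and involves eigenvector scaling); it only shows, for a hypothetical scalar function with a known decaying power series in $z^{-1}$, how N samples on the unit circle are converted into sequence terms with `numpy.fft.ifft`. The function name and the test function are assumptions made here.

```python
import numpy as np

def sequence_from_frequency_samples(f_of_z, n_points):
    """Sample f(z) at n_points equally spaced points on the unit circle
    and recover (aliased) sequence terms with the inverse DFT."""
    theta = 2.0 * np.pi * np.arange(n_points) / n_points
    samples = np.array([f_of_z(np.exp(1j * t)) for t in theta])
    return np.fft.ifft(samples)

# Hypothetical example: f(z) = 1 / (1 - 0.5 z^-1) = sum_k 0.5^k z^-k
seq = sequence_from_frequency_samples(lambda z: 1.0 / (1.0 - 0.5 / z), 180)
first_six_terms = seq[:6].real
```

Because the DFT is periodic, any terms with negative index (which arise when the function has no decaying one-sided expansion) appear wrapped around at the end of the returned array, and the recovered terms are exact only up to aliasing from terms beyond index N-1.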

A  convolution  controller  was  then  implemented  using  five-term 
approximations  for  W  and  V  and  the  following  diagonal  compensator: 


For  purposes  of  comparison,  an  ACC  was  designed  using  the  same  diagonal 
compensator and the following constant precompensators:

$$ W_A = \begin{bmatrix} 1.0190 & 0.3117 \\ -0.6381 & 0.9096 \end{bmatrix} \qquad V_A = \begin{bmatrix} 0.8081 & -0.2769 \\ 0.5668 & 0.9051 \end{bmatrix} $$

(where $W_A$ and $V_A$ were selected to approximate W and V at the gain crossover
frequency, $\omega T = 55°$). Figure 7.12 presents the percentage difference
between the actual compensated characteristic loci and the desired
characteristic loci for these implementations. This comparison suggests
that, although the improvements in accuracy are not as dramatic as seen
previously, significant improvements may still be obtained over wide
frequency ranges (including the frequency at which the ACC was designed to
perform best) using the finite-sequence representations generated by the
inverse DFT method. Hence, in situations where unstable branch points cannot
be avoided by precompensation, it is still possible to obtain accurate
convolution controllers.

  k      w12       w21       v11       v12       v21       v22
  0     0.2348    -.4128    0.8105    -.2041    0.2851    0.8105
  1     -.1451    -.3725    -.0311    0.1141    0.2863    -.0311
  2     -.0026    -.0020    0.0338    -.0056    0.0009    0.0338
  3     0.0214    -.0304    -.0109    -.0153    0.0343    -.0109
  4     -.0217    0.0159    0.0085    0.0171    -.0145    0.0085
  5     0.0200    -.0053    -.0025    -.0192    0.0092    -.0025

Table 7.2: Elements of ${}^sW$ and ${}^sV$ obtained from the Inverse DFT Algorithm

The  design  methodology  described  and  demonstrated  in  this  chapter 
offers  the  exciting  potential  to  produce  an  entire  range  of  multivariable 
algorithms  which  take  advantage  of  the  tremendous  computing  powers  currently 
being  developed.  Indeed,  using  this  methodology,  it  should  be  possible  to 
extend  the  concept  of  SISO  expert  system  design  based  on  frequency  response 
information (e.g. [JAM1]) to multivariable problems. In addition, the
methodology  offers  the  potential  for  implementing  multivariable  self-tuning 
algorithms  in  a  true  generalized-Nyquist  setting  by  combining  system 
identification  with  either  on-line  frequency  response  design  methods  or 
existing  SISO  self-tuning  design  algorithms.  One  such  algorithm  is 
developed  in  the  next  chapter. 


Figure 7.12: Characteristic Locus Comparison
(Unstable Branch Point Example)

Appendix 7.1: The Standard CSM Algorithm [KOU2]

Let $g_i(0)$ and $w_i(0)$ denote the i-th eigenvalue/eigenvector pair of G(0).
To obtain the values of $g_i(k)$ and $w_i(k)$ for any k > 0, first form the
matrix:

$$ M(k) = G(k)\,W(0) + \sum_{j=1}^{k-1} \left[ G(k-j)\,W(j) - W(j)\,\Lambda(k-j) \right] $$    ....(A7.1)

where $\Lambda(j) = \mathrm{diag}\{g_1(j), \cdots, g_m(j)\}$, and
$W(j) = [\, w_1(j) \mid w_2(j) \mid \cdots \mid w_m(j) \,]$.

(Note: if k=1, the summation term in eqn A7.1 should be dropped.) Then,
denoting the i-th column of M(k) by $m_i(k)$, calculate $g_i(k)$ and $w_i(k)$ using
the relationships:

$$ g_i(k) = v_i^t(0)\, m_i(k) $$    ....(A7.2a)

$$ w_i(k) = T_i^+(0)\, m_i(k) + a_i(k)\, w_i(0) $$    ....(A7.2b)

where $a_i(k)$ is an arbitrary constant (which, for convenience, may be set to
zero) and $T_i^+(0)$ is the commuting g2-Penrose inverse of $[g_i(0)I - G(0)]$ given
by:

$$ T_i^+(0) = \sum_{\substack{j=1 \\ j \neq i}}^{m} \frac{1}{g_i(0) - g_j(0)}\; w_j(0)\, v_j^t(0) $$    ....(A7.3)
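
The recursion of eqns A7.1 to A7.3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the implementation referred to in [KOU2]: the function name and data layout are hypothetical, G(0) is assumed to have distinct eigenvalues, the arbitrary constants $a_i(k)$ are set to zero, and plant weighting-sequence terms beyond those supplied are taken to be zero.

```python
import numpy as np

def csm_sequences(G, n_terms):
    """Return (g, W) with g[k, i] = g_i(k) and W[k][:, i] = w_i(k).

    G is a list of m x m arrays G(0), G(1), ...; terms past the end are zero.
    """
    G = [np.asarray(Gk, dtype=complex) for Gk in G]
    m = G[0].shape[0]
    zero = np.zeros((m, m), dtype=complex)

    def G_at(k):
        return G[k] if k < len(G) else zero

    # Initialization: eigenvalues, eigenvectors and dual eigenvectors of G(0).
    g0, W0 = np.linalg.eig(G[0])
    V0 = np.linalg.inv(W0)          # row i is the dual eigenvector of w_i(0)

    # Generalized inverses of [g_i(0) I - G(0)] as in eqn A7.3.
    T = []
    for i in range(m):
        Ti = zero.copy()
        for j in range(m):
            if j != i:
                Ti += np.outer(W0[:, j], V0[j, :]) / (g0[i] - g0[j])
        T.append(Ti)

    g, W = [g0], [W0]
    for k in range(1, n_terms):
        # Eqn A7.1 (the sum is empty when k = 1).
        M = G_at(k) @ W0
        for j in range(1, k):
            M += G_at(k - j) @ W[j] - W[j] @ np.diag(g[k - j])
        # Eqns A7.2a and A7.2b with a_i(k) = 0.
        g.append(np.array([V0[i, :] @ M[:, i] for i in range(m)]))
        W.append(np.column_stack([T[i] @ M[:, i] for i in range(m)]))
    return np.array(g), np.array(W)
```

A convenient check on any such implementation is that the computed sequences satisfy the convolution identity ${}^sG * {}^sW = {}^sW * {}^s\Lambda$ term by term.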

Appendix  7.2:  Modifications  to  the  Standard  CSM  Algorithm 

As suggested by the developments in Sections 7.1 and 7.2, situations do
exist for which the standard CSM algorithm presented in Appendix 7.1 cannot
be used to generate the CWS and CVS. In particular, the following three
special situations may arise:

Case 1: G(0) (the first element of the plant weighting sequence) has
        repeated eigenvalues associated with a simple Jordan form;

Case 2: the characteristic polynomial $\Delta(z,g)$ is reducible to linear
        factors and $G(z_0)$ has repeated eigenvalues associated with a
        nonsimple Jordan form for some $|z_0| > 1$ and $z_0^{-1} \neq 0$; and

Case 3: the characteristic polynomial $\Delta(z,g)$ is reducible to linear
        factors and G(0) (i.e. $G(z_0^{-1} = 0)$) has repeated eigenvalues
        associated with a nonsimple Jordan form.

In each instance, modifications to the standard CSM algorithm can be
developed to generate appropriate sequence representations for $g_i(z)$, $w_i(z)$,
and $v_i^t(z)$. These modifications are derived below. It must, however, be
stressed that the cases above rarely occur in practice and so the
developments here are presented primarily for completeness.


Case  1:  Simple,  Repeated  Eigenvalues  of  G(0) 

When $\Delta(z,g)$ is irreducible and G(0) has repeated eigenvalues associated
with a simple Jordan form, the standard CSM algorithm cannot be used to
generate all of the CWS and CVS. Yet in the absence of unstable branch
points, decaying sequences do exist (as suggested in Sections 7.2.2 and
7.2.3). So, an alternative algorithm must be developed to handle this
special problem. Ray has examined a similar problem [RAY1]; however, his
proposed modifications handle only the situation when $\Delta(z,g)$ is reducible to
linear factors. For the more general case, the following modifications are
required.

Since G(0) has simple, repeated eigenvalues, it can be written in the
following form:

$$ G(0) = W(0)\,\Lambda(0)\,V(0) $$    ....(A7.4)

where $\Lambda(0) = \mathrm{diag}\{\lambda_1(0), \cdots, \lambda_1(0), \lambda_{p+1}(0), \cdots, \lambda_m(0)\}$ ; $p > 1$
      $W(0) = [\, W_p(0) \mid w_{p+1}(0) \mid \cdots \mid w_m(0) \,]$
      $V(0) = [\, V_p(0) \mid v_{p+1}(0) \mid \cdots \mid v_m(0) \,]^t$

and $W_p(0)$ and $V_p(0)$ are both dimensioned [m x p] and satisfy the equations:

$$ \{\lambda_1(0)I - G(0)\}\, W_p(0) = 0 \qquad V_p^t(0)\, \{\lambda_1(0)I - G(0)\} = 0 $$    ....(A7.5a,b)

$$ V_p^t(0)\, W_p(0) = I $$    ....(A7.5c)

with rank$\{W_p(0)\}$ = rank$\{V_p(0)\}$ = p. The CWS and CVS associated with the
nonrepeated eigenvalues of G(0) can still be calculated using the standard
CSM algorithm. However, modifications are required to calculate the CWS and
CVS associated with the repeated eigenvalues. To begin, any vector of the
form

$$ w(0) = W_p(0)\, u(0) $$    ....(A7.6)

(where u(0) is an arbitrary vector of dimension p) satisfies eqn A7.5a. Now
consider the CWS and CVS associated with one of the repeated eigenvalues,
$\lambda_i(0)$. Let $w_i(0)$ denote the first element of this CVS and $\lambda_i(1)$ denote the
second element of this CWS. Then from eqn A7.6, $w_i(0)$ can be written as:

$$ w_i(0) = W_p(0)\, u_i(0) $$    ....(A7.7)

Substituting eqn A7.7 into the convolution equations defined by eqn 7.1
produces the following result at the second stage of the convolution:

$$ G(1)\, W_p(0)\, u_i(0) + G(0)\, w_i(1) = \lambda_i(1)\, W_p(0)\, u_i(0) + \lambda_1(0)\, w_i(1) $$    ....(A7.8)

Multiplying by $V_p^t(0)$ then produces:

$$ V_p^t(0)\, G(1)\, W_p(0)\, u_i(0) = \lambda_i(1)\, u_i(0) + V_p^t(0)\{\lambda_1(0)I - G(0)\}\, w_i(1) $$    ....(A7.9)

But $V_p^t(0)\,\{\lambda_1(0)I - G(0)\} = 0$, so eqn A7.9 reduces to:

$$ \{V_p^t(0)\, G(1)\, W_p(0)\}\, u_i(0) = \lambda_i(1)\, u_i(0) $$    ....(A7.10)

Thus, $u_i(0)$ and $\lambda_i(1)$ are the eigenvectors and eigenvalues of
$V_p^t(0)\, G(1)\, W_p(0)$, and $w_i(0)$ is uniquely determined by eqn A7.7. Once $w_i(0)$
is known for all $i = 1, \cdots, p$, the standard CSM algorithm can be implemented
without further changes to calculate the remaining terms in each of the p
CWS and CVS provided the commuting g2-Penrose inverse of $[\lambda_i(0)I - G(0)]$ for
$i = 1, \cdots, p$ is calculated as:

$$ [\lambda_i(0)I - G(0)]^+ = \sum_{j=p+1}^{m} \frac{w_j(0)\, v_j^t(0)}{\lambda_i(0) - \lambda_j(0)} $$    ....(A7.11)

Clearly, if some of the eigenvalues of $V_p^t(0)\, G(1)\, W_p(0)$ are repeated and
associated with a simple Jordan form, the above procedure can be reapplied
at the next stage of the convolution to establish the desired unique
representations for $w_i(0)$. It should be noted that this algorithm also
applies when all of the eigenvalues of G(0) are repeated. In this situation,
however, it can be simplified considerably since $W_p = V_p = I$. Hence, $w_i(0)$,
$i = 1, \cdots, m$, are simply the eigenvectors of G(1).
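
The initialization step for Case 1 can be sketched numerically as follows. This is an illustrative sketch only: the function name, the use of a singular value decomposition to obtain bases for the right and left eigenspaces, and the tolerance are assumptions made here, not part of the derivation above.

```python
import numpy as np

def repeated_eigenvalue_start(G0, G1, lam, tol=1e-9):
    """For a repeated eigenvalue lam of G0 (simple Jordan form), return the
    second CWS terms lambda_i(1) and the starting vectors w_i(0) as columns."""
    m = G0.shape[0]
    A = lam * np.eye(m) - G0
    u, s, vh = np.linalg.svd(A)
    p = int(np.sum(s < tol))              # multiplicity of the eigenvalue
    Wp = vh[m - p:].conj().T              # m x p basis with A @ Wp = 0
    Vp_t = u[:, m - p:].conj().T          # p x m basis with Vp_t @ A = 0
    Vp_t = np.linalg.solve(Vp_t @ Wp, Vp_t)   # scale so that Vp_t @ Wp = I
    # Eigenproblem of the projected second term (eqn A7.10).
    lam1, U = np.linalg.eig(Vp_t @ G1 @ Wp)
    return lam1, Wp @ U                   # eqn A7.7: w_i(0) = Wp u_i(0)
```

With the starting vectors chosen this way, the second-stage convolution equation (eqn A7.8) has a solution for each $w_i(1)$, which is the property the derivation relies on.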

Case 2: Nonsimple, Repeated Eigenvalues of $G(z_0)$ [$z_0^{-1} \neq 0$]

A second situation for which the standard CSM algorithm cannot be
applied is when $\Delta(z,g)$ is reducible to linear factors and $G(z_0)$ has repeated
eigenvalues associated with a nonsimple Jordan form for some $|z_0| > 1$. As
discussed in Section 7.2.4, this situation implies that W(z) loses rank at
$z_0$ and hence V(z) cannot be expanded in a power series with decaying
coefficients. To overcome this problem, modified versions of the CWS can be
used to obtain the desired sequence representations for ${}^sg_i$, ${}^sw_i$, and ${}^sv_i^t$.

For simplicity, the case where $G(z_0)$ has a single repeated eigenvalue
with multiplicity two will be considered. In this situation, the
eigenvalue/eigenvector decomposition of $G(z_0)$ is given by:

$$ G(z_0) = W^*(z_0)\, J(z_0)\, V^*(z_0) $$    ....(A7.12)

where $W^*(z_0)$ contains an appropriate pseudo-eigenvector so that $W^*(z_0)$ and
$V^*(z_0)$ are full rank,

$$ J(z_0) = \begin{bmatrix} g_1(z_0) & 1 & 0 \\ 0 & g_1(z_0) & 0 \\ 0 & 0 & \Lambda_{m-2}(z_0) \end{bmatrix} $$

and $\Lambda_{m-2}(z_0) = \mathrm{diag}\{g_3(z_0), \cdots, g_m(z_0)\}$.

From eqn A7.12, it is clear that the problem of W(z) losing rank at $z_0$ may
be avoided simply by redefining the CWS to ensure that the resulting power
series representation derived from the CWS contains a nonzero element in the
appropriate off-diagonal position.

When $z_0^{-1} \neq 0$, G(0) will, by assumption, have a simple Jordan form, and
the desired modifications can be achieved by defining the CWS in the
following way:

$$ {}^s\Lambda = \{\, \Lambda(0),\ J(1),\ \Lambda(2),\ \cdots,\ \Lambda(k),\ \cdots \,\} $$    ....(A7.13)

where $\Lambda(k) = \mathrm{diag}\{g_1(k), \cdots, g_m(k)\}$ and

$$ J(1) = \Lambda(1) + \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} $$

(the last block row and column being of dimension m-2). With the modified CWS
defined in this manner, J(z) can be written in power series form as:

$$ J(z) = \Lambda(0) + J(1)\, z^{-1} + \sum_{i=2}^{\infty} \Lambda(i)\, z^{-i} $$    ....(A7.14)

Hence, the diagonal elements of J(z) still represent the power series
representations for the characteristic gains of G(z), while the nonzero
off-diagonal term in J(1) ensures that $J(z_0)$ has a nonzero term in the
appropriate off-diagonal position. As a result, $W^*(z)$ will be full rank at
$z_0$, and decaying sequence representations for both $W^*$ and $V^*$ can be
obtained.

Since $\Lambda(0)$ in eqn A7.13 is identical to that used by the standard CSM
algorithm, the initialization procedure for this modified algorithm will be
exactly the same as that given in Appendix 7.1. Furthermore, the algorithm
defined by eqns A7.1 and A7.2 can still be used to calculate $g_i(k)$ and $w_i(k)$
provided the following adjustment is made to the second column of the matrix
M(k):

$$ m_2(k) = m_2(k) - w_1^*(k-1) $$    ....(A7.15)

This adjustment is required to ensure that the convolution relationship
defined by eqn 7.1 remains valid for the modified CWS. Of course, the above
results can also be extended to the problem of a repeated nonsimple
eigenvalue of $G(z_0)$ with multiplicity q. For this situation, the only
required modifications to the standard algorithm are given by:

$$ m_i(k) = m_i(k) - w_{i-1}^*(k-1) $$    ....(A7.16)

The modifications identified above are not, however, adequate when
$z_0^{-1} = 0$ (i.e. when G(0) has repeated, nonsimple eigenvalues) because the
modified CWS defined by eqn A7.13 cannot account for the nonsimple Jordan
structure of G(0). For this special situation, more complex modifications
are required as described below.

Case 3: Nonsimple, Repeated Eigenvalues of G(0)

When  A(z,g)  is  reducible  to  linear  factors  and  G(0)  has  repeated 
eigenvalues  associated  with  a  nonsimple  Jordan  form,  another  modified 
version  of  the  CSM  algorithm  can  be  established  to  produce  the  desired 
results  by  redefining  the  CWS  in  the  following  way: 

SA  =  {  J(0),  A(l),  A(2) ,  •••,  A(k) ,  •••  }  ....(A7.17) 

where  A(k)  =  diag  fg^k),  ••*,  gm(k)}  ; 


J(0)  = 


gj(0) 


1  I 

o  gj(0)  I 

I 


0 


’\  r* 

in-2 


and  Am_2(0)  =  diag  (g3(°) »  gm(0)}. 

(Note:  For  simplicity,  the  problem  of  a  single  repeated  eigenvalue  with 

multiplicity  two  will  again  be  highlighted,  although  the  results  can  be 
extended  to  other  situations.)  Unfortunately,  this  new  definition  of  the 
CWS  suggests  that  the  initialization  procedures  associated  with  the 
algorithm  will  be  affected  by  the  modifications.  Therefore,  a  modified 
procedure  must  ensure  not  only  that  appropriate  calculations  are  made  at 
each  step  but  also  that  an  appropriate  initialization  procedure  is  defined. 
For  the  CWS  defined  by  eqn  A7.17,  G(0)  can  be  rewritten  as: 

$G(0) = W^*(0)\, J(0)\, V^*(0)$

where $w_2^*(0)$ and $v_1^{*t}(0)$ are defined by:

$G(0)\, w_2^*(0) = g_1(0)\, w_2^*(0) + w_1^*(0)$

$v_1^{*t}(0)\, G(0) = g_1(0)\, v_1^{*t}(0) + v_2^{*t}(0)$

so that W*(0) and V*(0) are full rank, and the following convolution
relationships can be established:

$\sum_{i=0}^{k} G(k-i)\, W^*(i) = \sum_{i=0}^{k-1} W^*(i)\, \Lambda(k-i) + W^*(k)\, J(0)$ ....(A7.18)

Thus, the standard CSM algorithm may still be used to calculate $s_{g_i}$ and
$s_{w_i^*}$ for i = 3, ..., m provided the appropriate generalized inverse
(described in [LI1]) is used. However, further modifications are required
when computing $s_{g_i}$ and $s_{w_i^*}$ for i = 1, 2.

For further ease of presentation in the developments to follow, only
the case where m = 2 will be considered. The set of equations defined by
A7.18 then reduces to:

$\sum_{i=0}^{k} G(k-i)\, w_1^*(i) = \sum_{i=0}^{k} g_1(k-i)\, w_1^*(i); \quad k \ge 0$ ....(A7.19a)

$\sum_{i=0}^{k} G(k-i)\, w_2^*(i) = \sum_{i=0}^{k-1} g_2(k-i)\, w_2^*(i) + g_1(0)\, w_2^*(k) + w_1^*(k); \quad k \ge 0$ ....(A7.19b)

Now, $g_1(0)$, $w_1^*(0)$, and $w_2^*(0)$ can be calculated using standard techniques on
G(0). The problem of computing $g_1(1)$, $g_2(1)$, $w_1^*(1)$ and $w_2^*(1)$ can then be
addressed by examining eqns A7.19a,b when k = 1. In particular, multiplying
through by $v_1^{*t}(0)$ and $v_2^{*t}(0)$, respectively, yields:

$v_1^{*t}(0)\, G(1)\, w_1^*(0) + v_2^{*t}(0)\, w_1^*(1) = g_1(1)$ ....(A7.20a)

$v_1^{*t}(0)\, G(1)\, w_2^*(0) + v_2^{*t}(0)\, w_2^*(1) = v_1^{*t}(0)\, w_1^*(1)$ ....(A7.20b)

$v_2^{*t}(0)\, G(1)\, w_1^*(0) = 0$ ....(A7.20c)

$v_2^{*t}(0)\, G(1)\, w_2^*(0) = g_2(1) + v_2^{*t}(0)\, w_1^*(1)$ ....(A7.20d)


But these four equations generate only 3 constraints on the 4 desired
unknowns. Eqn A7.20c simply implies that $v_2^{*t}(0)$ is also an eigenvector of
G(1). The necessary fourth constraint can, however, be obtained by
proceeding to k = 2 and multiplying through eqn A7.19a by $v_2^{*t}(0)$ to produce:

$v_2^{*t}(0)\, G(2)\, w_1^*(0) + v_2^{*t}(0)\, G(1)\, w_1^*(1) = g_1(1)\, \{ v_2^{*t}(0)\, w_1^*(1) \}$ ....(A7.21)

$g_1(1)$ and $w_1^*(1)$ can now be identified in the following manner. Since $w_1^*(0)$
and $w_2^*(0)$ span the two-dimensional space, $w_1^*(1)$ can be written as a linear
combination of these vectors:

$w_1^*(1) = \alpha_1(1)\, w_1^*(0) + \beta_1(1)\, w_2^*(0)$ ....(A7.22)

Substituting this expression in eqns A7.20a and A7.21 generates the
following equation for $\beta_1(1)$:

$\beta_1^2(1) + \{ v_1^{*t}(0)\, G(1)\, w_1^*(0) - v_2^{*t}(0)\, G(1)\, w_2^*(0) \}\, \beta_1(1) - v_2^{*t}(0)\, G(2)\, w_1^*(0) = 0$ ....(A7.23)

Once $\beta_1(1)$ has been computed, the remaining unknowns can be identified using
eqns A7.20a,b,d. Notice that the value for $\alpha_1(1)$ in eqn A7.22 will not
affect any of the other unknown quantities, and hence, can be arbitrarily
specified (or, for convenience, set to zero). It is also important to note
that both solutions for $\beta_1(1)$ identified by eqn A7.23 are valid. Since
$g_2(0) = g_1(0)$, the selection of either solution will simply identify the two
distinct sequences, $s_{g_1}$ and $s_{g_2}$.
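
The root selection described above can be illustrated with a short Python sketch. It solves a quadratic of the form of eqn A7.23 and then evaluates eqns A7.20a and A7.20d for each root. The function names and the scalar arguments a11, a22 and b21 are assumptions introduced for this example only; they stand for the projections $v_1^{*t}(0) G(1) w_1^*(0)$, $v_2^{*t}(0) G(1) w_2^*(0)$ and $v_2^{*t}(0) G(2) w_1^*(0)$, which are taken to have been computed elsewhere.

```python
import cmath


def solve_beta(a11, a22, b21):
    """Roots of beta^2 + (a11 - a22) * beta - b21 = 0 (form of eqn A7.23).

    a11, a22, b21 are hypothetical names for the scalar projections
    v1(0) G(1) w1(0), v2(0) G(1) w2(0) and v2(0) G(2) w1(0).
    """
    p = a11 - a22
    disc = cmath.sqrt(p * p + 4.0 * b21)
    return (-p + disc) / 2.0, (-p - disc) / 2.0


def first_elements(a11, a22, b21):
    """Return one (beta, g1(1), g2(1)) triple per root of the quadratic."""
    out = []
    for beta in solve_beta(a11, a22, b21):
        g1 = a11 + beta  # eqn A7.20a with v2(0) w1(1) = beta
        g2 = a22 - beta  # eqn A7.20d with v2(0) w1(1) = beta
        out.append((beta, g1, g2))
    return out


if __name__ == "__main__":
    for beta, g1, g2 in first_elements(1.0, 3.0, 3.0):
        print(beta, g1, g2)
```

In this sketch the value of g2(1) obtained from one root equals the value of g1(1) obtained from the other, which is consistent with the remark that either root simply labels the two sequences in a different order.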

Once $w_1^*(1)$ and $g_1(1)$ are known, the calculation of $w_i^*(k)$ and $g_i(k)$ for
i = 1, 2 can be accomplished using the following procedures:

(1) Form $m_i(k)$ as defined by eqn A7.1.

(2) Form $m_i(k+1) = G(k+1)\, w_i^*(0) + G(k)\, w_i^*(1) + \sum_{j=2}^{k-1} G(k-j+1)\, w_i^*(j)$.

(3) Let $w_1^*(k) = \alpha_1(k)\, w_1^*(0) + \beta_1(k)\, w_2^*(0)$ where $\alpha_1(k)$ is arbitrary,
$\beta_1(k)$ is defined by:

$\beta_1(k) = \{ g_1(1) + \beta_1(1) - \lambda_2(1) \}^{-1}\, \{ v_2^{*t}(0)\, m_1(k+1) - \beta_1(1)\, v_1^{*t}(0)\, m_1(k) \}$,

and $\lambda_2(1) = v_2^{*t}(0)\, G(1)\, w_2^*(0)$.

(4) Calculate $g_1(k)$, $g_2(k)$ and $w_2^*(k)$ using the following
relationships:

$g_1(k) = v_1^{*t}(0)\, m_1(k) + \beta_1(k)$
$g_2(k) = v_2^{*t}(0)\, m_2(k) - \beta_1(k)$

$w_2^*(k) = -w_2^*(0)\, v_1^{*t}(0)\, \{ m_2(k) - w_1^*(k) \} + \alpha_2(k)\, w_1^*(0)$

where $\alpha_2(k)$ is an arbitrary constant (which may, for convenience,
be set to zero).

Although  the  above  procedure  highlights  the  development  when  m  =  2,  it  can 
also  be  extended  to  higher  dimensions.  The  details  of  these  additional 
developments  will  not,  however,  be  presented  here. 


CHAPTER  EIGHT 




A  MULTIVARIABLE  GENERALIZATION  OF  PREDICTIVE  SELF  TUNING  CONTROL 

The  characteristic  subsystem  decomposition  developed  and  investigated 
in  the  previous  chapter  establishes  an  important  link  between  the  frequency- 
domain  design  of  multivariable  control  systems  and  the  time-domain 
implementation  of  the  resulting  design.  As  demonstrated  in  Sections  7.4  and 
7.5,  the  subsystem  representations  can  be  used  to  simplify  the  off-line 
design  of  multivariable  compensators  significantly  by  completely  reducing 
the  multivariable  control  problem  to  a  set  of  independent  SISO  problems. 
They  also  hold  the  key  to  the  development  of  on-line,  computer-implemented 
control  algorithms  which  account  for  the  multivariable  nature  of  the  system 
within  a  true  generalized-Nyquist  framework. 

As highlighted in Chapter 1, a number of successful SISO control
algorithms  have  been  developed  for  use  in  on-line  self-tuning  applications 
where  the  resulting  control  law  is  combined  with  on-line  identification  of 
the  model  parameters  of  the  plant.  Each  of  these  SISO  algorithms  relies  on 
a  system  description  that  can  be  embedded  in  an  on-line  computer  procedure 
to  generate  appropriate  controller  outputs.  For  SISO  systems,  such 
descriptions are readily available. However, when MIMO systems are
considered, the additional complexities introduced by the multivariable
characteristics of the problem make the development of an appropriate
description much more challenging. Indeed, past efforts have resorted to
methods which either attempt to eliminate these multivariable characteristics
completely by decoupling the dynamics of the open-loop system or,
alternatively, consolidate the available multivariable information into a
single scalar measure, thereby addressing the fundamental problem of
multivariable interaction in only an indirect manner. But now, with the
characteristic subsystem descriptions developed previously, the multivariable
problem can be decomposed into a set of independent SISO problems
within a time-domain, rather than the more conventional frequency-domain,
setting. The result is a complete multivariable system description which
can be used in conjunction with existing on-line SISO control algorithms to
yield effective multivariable generalizations of these algorithms.

A  particularly  convenient  SISO  algorithm  for  this  application  is  the 
Generalized Predictive Control (GPC) algorithm proposed by Clarke et al.
[CLA3], [CLA4]. Though based on a difference equation model of the plant,
this algorithm relies on explicit knowledge of the first several elements of
the corresponding weighting sequence to establish an optimal, on-line
control law. This particular characteristic of SISO GPC provides an
immediate and important link to the CWS that can be exploited when
developing an implementable algorithm for multivariable systems.

The  goal  of  this  chapter  is  to  integrate  the  SISO  GPC  approach  into  the 
characteristic  subsystem  framework  and  to  demonstrate  the  utility  of  the 
resulting  algorithm  for  self-tuning  applications  involving  multivariable 
systems.  The  development  begins  by  reformulating  the  GPC  control  law  in 
terms  of  the  characteristic  subsystem  descriptions  developed  in  Chapter  7. 
The  result  is  a  predictive  control  law  based  on  the  identification  of 
independent  controls  for  each  subsystem.  Next,  the  self-tuning  requirement 
for  on-line  identification  of  appropriate  model  parameters  is  addressed.  It 
is  suggested  that  standard  identification  algorithms  can  be  combined  with 
the  standard  CSM  algorithm  (Appendix  7.1)  to  obtain  the  required  subsystem 
descriptions. However, more computationally efficient algorithms for the
direct  identification  of  the  subsystems  are  also  proposed.  Finally,  a  brief 
summary  of  the  computational  requirements  of  the  algorithm  is  presented,  and 
the  chapter  concludes  with  simulation  results  which  demonstrate  the 
effectiveness  and  flexibility  of  the  proposed  algorithm. 

8.1  Long-Range  Predictive  Control  Using  Subsystem  Descriptions 

In this section, a multivariable generalization of SISO GPC will be
developed using the characteristic subsystem descriptions developed
previously. Although a weighting sequence description of the plant is used
in the initial development of the subsystem control laws (producing results
with obvious connections to Dynamic Matrix Control [CUT1]), the results will
also be extended to systems described by difference equation models. For
this reason, attempts will be made (as far as possible throughout the
development) to keep the notation consistent with that of the SISO GPC
algorithm presented in [CLA3].

8.1.1  Plant  Models  and  Output  Prediction 

Consider  the  multivariable  system  described  by  the  input-output 
relationship 

$y(t) = G\, u(t) + (1/\Delta)\, \xi(t)$ ....(8.1)

where y, u, and $\xi$ are [m x 1] output, input, and noise vectors respectively
defined at time t, the elements of $\xi$ are assumed to be uncorrelated with
zero mean, G is a polynomial matrix in powers of $z^{-1}$, and $\Delta = 1 - z^{-1}$ is a
scalar. [From this point forward, bold print will be used to denote
operators on time-domain information.] As suggested in the previous
chapter, G may be written in eigenvalue/eigenvector form as:

$G = W\, \Lambda\, V$ ....(8.2)

and since $V = W^{-1}$, premultiplication of eqn 8.1 by $V \Delta$ yields the modified
input/output relationship:

$\Delta \tilde{y}(t) = \Lambda\, \Delta \tilde{u}(t) + \tilde{\xi}(t)$ ....(8.3)

where $\tilde{y}(t) = V y(t)$; $\tilde{u}(t) = V u(t)$; $\tilde{\xi}(t) = V \xi(t)$ ....(8.4)

In effect, this transformation constitutes a projection (in a
"convolutional" sense) onto the eigenvector frame and reduces the
multivariable description of the system to a set of m scalar models each
described by a relationship of the form:

$\Delta \tilde{y}_i(t) = g_i\, \Delta \tilde{u}_i(t) + \tilde{\xi}_i(t)$ ....(8.5)

This  set  of  subsystem  descriptions  provides  the  foundation  for  generalizing 
the  SISO  GPC  algorithm  to  multivariable  systems. 

To establish the appropriate controls, consider first the problem of
predicting future system outputs. From eqn 8.3, the system outputs,
referred to the eigenvector frame, at a time k samples in the future are
given by:

$\tilde{y}(t+k) = \tilde{y}(t+k-1) + \Lambda\, \Delta \tilde{u}(t+k) + V \xi(t+k)$ ....(8.6)

Now using similar relationships for $\tilde{y}(t+k-1)$, $\tilde{y}(t+k-2)$, ..., $\tilde{y}(t+1)$, eqn
8.6 can be rewritten as:

$\tilde{y}(t+k) = \tilde{y}(t) + \{ \sum_{j=0}^{k-1} z^{-j} \}\, \Lambda\, \Delta \tilde{u}(t+k) + \{ \sum_{j=0}^{k-1} z^{-j} \}\, V \xi(t+k)$ ....(8.7)

By defining $E_k$ and $N_k$ as matrix series in powers of $z^{-1}$ such that

$E_k = \sum_{j=0}^{k-1} E_k(j)\, z^{-j}$ and $\{ \sum_{j=0}^{k-1} z^{-j} \}\, V = E_k + z^{-k} N_k$ ....(8.8)

the portion of $\tilde{y}(t+k)$ due entirely to future values of noise can be
isolated. The best available estimate of $\tilde{y}(t+k)$ at time t is, therefore,
given by:

$\tilde{y}(t+k|t) = \tilde{y}(t) + \{ \sum_{j=0}^{k-1} z^{-j} \}\, \Lambda\, \Delta \tilde{u}(t+k) + N_k\, \xi(t)$ ....(8.9)

Furthermore, from eqn 8.1,

$\xi(t) = \Delta y(t) - G\, \Delta u(t) = W \Delta \tilde{y}(t) - W \Lambda\, \Delta \tilde{u}(t)$ ....(8.10)


and this relationship can be introduced into eqn 8.9 to produce:

$\tilde{y}(t+k|t) = \{ I + N_k W \Delta \}\, \tilde{y}(t) + E_k W \Lambda\, \Delta \tilde{u}(t+k)$

or, following the notation in [CLA3],

$\tilde{y}(t+k|t) = \tilde{F}_k\, \tilde{y}(t) + \tilde{G}_k\, \Delta \tilde{u}(t+k) = F_k\, y(t) + G_k\, \Delta u(t+k)$ ....(8.11)

where $F_k = (V + N_k \Delta)$, $\tilde{G}_k = E_k W \Lambda$, $G_k = E_k G$ ....(8.12)

Finally, eliminating $N_k$ from eqns 8.8 and 8.12 yields the Diophantine
identity:

$V = E_k \Delta + z^{-k} F_k$ ....(8.13)

from which $F_k$ and $E_k$ can be identified. For this eigenvector weighting
sequence formulation of the problem, the solution of eqn 8.13 is given
by:


$E_k(j) = \sum_{i=0}^{j} V(i); \quad j = 0, \cdots, k-1$

$F_k(0) = \sum_{i=0}^{k} V(i); \quad F_k(j) = V(j+k); \quad j > 0$

Using the results above, it is easy to demonstrate that the first k
elements of $\tilde{G}_k$ are given by:

$\tilde{G}_k(j) = \sum_{i=1}^{j} \Lambda(i); \quad j = 1, \cdots, k$ ....(8.14)

where $\Lambda(i)$ denotes the i-th element of $S_\Lambda$ (defined in Chapter 7). Since each
term in eqn 8.14 is diagonal, eqn 8.11 indicates that each predicted
subsystem output, $\{ \tilde{y}_i(t+k|t); k > 0 \}$, is directly related only to known
quantities and to the unknown scalar inputs, $\{ \tilde{u}_i(t+j); 0 \le j < k \}$,
associated with the same subsystem. Furthermore, the first k terms of $\tilde{G}_k \Delta$
are given by:

$\{ \tilde{G}_k \Delta \}(j) = \Lambda(j); \quad j = 1, \cdots, k$ ....(8.15)

So, the relationship between $\tilde{y}_i(t+k|t)$ and $\{ \tilde{u}_i(t+j); 0 \le j < k \}$ is
completely defined by the first k elements of the appropriate CWS. On this
basis,  the  multivariable  self-tuning  control  problem  can  be  reduced  to  a  set 
of  scalar  problems  as  shown  in  Figure  8.1,  and  a  predictive  control  law  for 
each  scalar  subsystem  can  be  generated  using  the  appropriate  CWS.  The 

result  is  a  multivariable  generalization  of  SISO  predictive  self-tuning 
control  in  a  true,  generalized-Nyquist  framework. 
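
The closed-form solution of the Diophantine identity (eqn 8.13) can be checked numerically. The following Python sketch does so for a scalar sequence v(0), v(1), ..., which is an assumption made for brevity: the matrix case applies the same sums element by element. The function names are hypothetical and the sequence values are arbitrary test data.

```python
def diophantine_solution(v, k):
    """Return (E_k, F_k) coefficient lists for a scalar sequence v.

    E_k(j) = v(0) + ... + v(j)        for j = 0, ..., k-1
    F_k(0) = v(0) + ... + v(k),  F_k(j) = v(j + k) for j > 0
    Requires k <= len(v) - 1.
    """
    e = [sum(v[: j + 1]) for j in range(k)]
    f = [sum(v[: k + 1])] + list(v[k + 1:])
    return e, f


def poly_mul(a, b):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def reconstruct(e, f, k):
    """Form E_k * (1 - z^-1) + z^-k * F_k as a coefficient list."""
    left = poly_mul(e, [1.0, -1.0])
    right = [0.0] * k + list(f)
    n = max(len(left), len(right))
    left += [0.0] * (n - len(left))
    right += [0.0] * (n - len(right))
    return [a + b for a, b in zip(left, right)]


if __name__ == "__main__":
    v = [1.0, 0.5, 0.25, 0.125, 0.0625]
    e, f = diophantine_solution(v, 2)
    print(reconstruct(e, f, 2))  # should reproduce v
```

Reconstructing the right-hand side of the identity from the computed coefficients and comparing it with the original sequence gives a quick consistency check on any implementation of the prediction step.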

Before  proceeding  to  this  control  law  development,  it  must  be  noted 

that  the  output  predictions  and  Diophantine  identity  defined  by  eqns  8.11 
and  8.13  were  developed  using  a  weighting  sequence  description  of  the  system 
transfer  function  matrix.  If  instead,  a  difference  equation  model  of  the 
form: 

$A\, \Delta y(t) = B\, \Delta u(t) + \xi(t)$ ....(8.16)

was assumed, the predicted outputs in the eigenvector frame would still
satisfy the relationships defined by eqn 8.11 provided $\tilde{G}_k = G_k W$,
$G_k = E_k B$, and $E_k$ and $F_k$ are defined by the modified Diophantine identity:

$V = E_k A\, \Delta + z^{-k} F_k$ ....(8.17)

Furthermore, $\tilde{y}_i(t+k|t)$ would still be directly related only to known
quantities and to the unknown scalar inputs $\{ \tilde{u}_i(t+j); 0 \le j < k \}$. So, the
multivariable control problem could still be reduced to m scalar problems.

8.1.2 The Predictive Control Law

Assuming that a sequence of future set point or reference vectors
$\{ c(t+k); k > 0 \}$ is available, these quantities may be projected onto the
eigenvector frame using the relationship

$\tilde{c}(t+k) = V c(t+k)$ ....(8.18)

to establish the scalar subsystem set point sequences, $\{ \tilde{c}_i(t+k); k > 0 \}$.
Appropriate controls can then be identified separately for each individual
subsystem using the basic GPC algorithm presented in [CLA3]. In particular,
the future controls for the i-th subsystem can be chosen to minimize the cost
function

$J_i = E\{ \sum_{j=N_1}^{N_2} [ \tilde{y}_i(t+j) - \tilde{c}_i(t+j) ]^2 + \sum_{j=0}^{N_2-1} \rho_i(j)\, \Delta \tilde{u}_i^2(t+j) \}$ ....(8.19)

where $N_1$ is the minimum prediction horizon, $N_2$ is the maximum prediction
horizon, $\rho_i(j)$ is a sequence of control weights (for simplicity in the
following development, $\rho_i(j)$ will be assumed to be constant over all j and
will be set equal to $\rho_i$), and the expectation in eqn 8.19 is conditioned on
data up to time t assuming no future measurements are available. Using eqn
8.11, $\tilde{y}_i(t+k|t)$ can be written in terms of future inputs and known data as:

$\tilde{y}_i(t+k|t) = \sum_{j=0}^{k-1} \bar{g}_i(k-j)\, \Delta \tilde{u}_i(t+j) + r_i(k)$ ....(8.20)

where $\bar{g}_i(k) = \sum_{j=1}^{k} g_i(j)$, ....(8.21a)

$r_i(k) = f_i^{k\,t}\, y(t) + g_i^{k\,t}\, \Delta u(t+k) - z^{k-1} \{ \sum_{j=0}^{k-1} \bar{g}_i(j+1)\, z^{-j} \}\, \Delta \tilde{u}_i(t)$ ....(8.21b)

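and $f_i^{k\,t}$, $g_i^{k\,t}$ are the i-th row vectors of $F_k$ and $G_k$ respectively. [Note: The
fact that the first k elements of $\tilde{G}_k$ are diagonal has been used implicitly
to establish the expressions for $\tilde{y}_i(t+k|t)$ and $r_i(k)$ given by eqns 8.20 and
8.21b, respectively.] From eqn 8.20, the predicted outputs $\tilde{y}_i(t+j|t)$ for
$j = N_1, \cdots, N_2$ can now be written in vector form (with a slight abuse of
notation) as:

$\tilde{y}_i(N_1, N_2) = \bar{G}_i\, \Delta \tilde{u}_i + r_i$ ....(8.22)

where

$\tilde{y}_i(N_1, N_2) = [\, \tilde{y}_i(t+N_1|t) \; \cdots \; \tilde{y}_i(t+N_2|t) \,]^t$,

$\Delta \tilde{u}_i = [\, \Delta \tilde{u}_i(t) \; \cdots \; \Delta \tilde{u}_i(t+N_2-1) \,]^t$,

$r_i = [\, r_i(N_1) \; \cdots \; r_i(N_2) \,]^t$

and $\bar{G}_i$ is a matrix of dimension [($N_2 - N_1 + 1$) x $N_2$] given by:

$\bar{G}_i = \begin{bmatrix} \bar{g}_i(N_1) & \cdots & \bar{g}_i(1) & 0 & \cdots & 0 \\ \vdots & & & \ddots & & \vdots \\ \bar{g}_i(N_2) & \cdots & & & \cdots & \bar{g}_i(1) \end{bmatrix}$

$J_i$ (defined by eqn 8.19) can now be written as:

$J_i = \{ (\bar{G}_i \Delta \tilde{u}_i + r_i - \tilde{c}_i)^t\, (\bar{G}_i \Delta \tilde{u}_i + r_i - \tilde{c}_i) \} + \rho_i\, \Delta \tilde{u}_i^t\, \Delta \tilde{u}_i$

where $\tilde{c}_i = [\, \tilde{c}_i(t+N_1) \; \cdots \; \tilde{c}_i(t+N_2) \,]^t$ is the vector of future set points for
the specified subsystem, and minimizing with respect to the vector $\Delta \tilde{u}_i$
(assuming no constraints on the future controls) generates the desired
incremental control law given by:

$\Delta \tilde{u}_i = \{ \bar{G}_i^t \bar{G}_i + \rho_i I \}^{-1}\, \bar{G}_i^t\, (\tilde{c}_i - r_i)$ ....(8.23)

The current control increment for subsystem i, $\Delta \tilde{u}_i(t)$, is then the first
element of the vector $\Delta \tilde{u}_i$.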
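
A minimal Python sketch of the subsystem control law (eqn 8.23) is given here for a single scalar subsystem. It is an illustration under stated assumptions rather than the implementation used in this work: the function names are hypothetical, the weighting sequence g is supplied as a plain list with g[0] holding g(1), and the normal equations are solved with a small Gaussian elimination routine so that no external library is needed.

```python
def step_coefficients(g, n):
    """Cumulative sums g_bar(k) = g(1) + ... + g(k), k = 1..n (eqn 8.21a).

    g[0] holds g(1); elements beyond the supplied list are taken as zero.
    """
    out, total = [], 0.0
    for k in range(n):
        total += g[k] if k < len(g) else 0.0
        out.append(total)
    return out


def dynamic_matrix(g, n1, n2):
    """Rows k = N1..N2; entry (k, j) is g_bar(k - j) for j < k, else zero."""
    gbar = step_coefficients(g, n2)
    return [[gbar[k - j - 1] if j < k else 0.0 for j in range(n2)]
            for k in range(n1, n2 + 1)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for q in range(col, n + 1):
                m[r][q] -= factor * m[col][q]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][q] * x[q] for q in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def gpc_increments(g, n1, n2, rho, c, r):
    """Solve (G^t G + rho I) du = G^t (c - r), the form of eqn 8.23."""
    gm = dynamic_matrix(g, n1, n2)
    rows, cols = len(gm), n2
    gtg = [[sum(gm[i][p] * gm[i][q] for i in range(rows))
            + (rho if p == q else 0.0)
            for q in range(cols)] for p in range(cols)]
    rhs = [sum(gm[i][p] * (c[i] - r[i]) for i in range(rows))
           for p in range(cols)]
    return solve(gtg, rhs)


if __name__ == "__main__":
    du = gpc_increments([1.0], 1, 2, 0.0, [1.0, 1.0], [0.0, 0.0])
    print(du[0])  # only the first element is applied at time t
```

With rho = 0 and N1 > 1 the normal-equation matrix can become singular, so a small positive control weight is a sensible default when experimenting with this sketch.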


8.1.3  Implementing  the  Control  Law 

An  implementation  diagram  for  the  predictive  control  algorithm 
developed  in  the  previous  sections  is  displayed  in  Figure  8.2.  As  suggested 
by  this  diagram,  the  implementation  can  be  conveniently  separated  into  three 
distinct  tasks: 

(1)  prediction  of  future  subsystem  outputs  over  the  specified  prediction 
horizons, 

(2)  identification  of  incremental  controls  for  each  subsystem,  and 

(3)  transformation  of  the  identified  subsystem  controls  to  produce  the 
actual  system  controls  (which  are  to  be  applied  to  the  plant). 

Assuming that W, V, and the CWS are available, each of these tasks can be
summarized  as  follows: 


Task  1:  Subsystem  Output  Prediction 

From  eqn  8.23,  it  is  clear  that  the  first  step  in  calculating  the 
appropriate incremental controls is the identification of r, the vector of
future output predictions based on past known input/output data. One method
of generating this information is suggested by the previous discussion. In
particular, the appropriate Diophantine identity (eqn 8.13 or 8.17) can be
used to solve for $E_k$ and $F_k$. If a weighting sequence model is used, the
solution (as mentioned previously) is obvious. If a difference equation
model is used, a recursive algorithm based on a simple matrix extension of
the scalar algorithm proposed in [CLA3] can be used to accomplish the task.
Using $E_k$, $F_k$, and G (or A, B), the prediction of future subsystem outputs due
to past available information can then be computed as shown in eqn 8.21b.

For  presentation  purposes,  the  use  of  Diophantine  identities  provides  a 
convenient means of describing the relationship between future outputs and
past, present, and future information. In practice, however, it is possible
to predict the future outputs using an alternative and computationally more
efficient procedure. Indeed, simulation techniques are typically used in
the SISO GPC algorithm to generate this information, and the same approach
can be extended to the multivariable problem. In the absence of information
on the future noise, $\xi(t+k)$ can be assumed to be zero and the system output at
time t+k is described by:

$y(t+k) = y(t+k-1) + G\, \Delta u(t+k)$ ....(8.24a)

$y(t+k) = y(t+k-1) + (I - A)\, \Delta y(t+k-1) + B\, \Delta u(t+k)$ ....(8.24b)

Eqn 8.24 defines input/output relationships which can be used in a
simulation algorithm (using available input/output information) to establish
the desired output predictions in the plant frame. The corresponding
eigenframe predictions can then be generated by transforming these
plant-frame predictions in a manner analogous to that for the set point
information (eqn 8.18).

Figure 8.2: Implementation of the Multivariable Predictive Control Algorithm

One  further  point  concerning  the  computation  of  r  using  simulation 
techniques  (as  suggested  above)  must  also  be  highlighted.  Since  the 
proposed control law (eqn 8.23) is defined in the eigenvector frame, it must
be assumed that the future control increments in the eigenvector frame are zero.
This does not imply that the corresponding future plant-frame control
increments (used in eqn 8.21b or eqn 8.24) are zero. In fact, future
plant-frame controls are defined by:

$\Delta u(t+k) = W\, \Delta \tilde{u}(t+k); \quad k > 0$

and may be non-zero depending on the length of W and the value of k. These
non-zero future control increments must be included in the calculation of r
(when using plant frame information) to obtain the proper results.
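
The simulation approach to output prediction can be sketched as follows for a scalar weighting sequence model of the form of eqn 8.24a. This is a simplified illustration, not the multivariable procedure itself: the function name, the list conventions (g[0] holds g(1), du_past[-1] holds the increment at t-1, du_future[0] holds the increment at t), and the optional du_future argument are all assumptions. The du_future argument is where any non-zero future plant-frame increments of the kind discussed above would be supplied.

```python
def free_response(g, y_now, du_past, horizon, du_future=None):
    """Predict y(t+1), ..., y(t+horizon) by simulating
    y(t+k) = y(t+k-1) + sum_i g(i) * du(t+k-i) with zero future noise.

    g[i-1] holds g(i); du_past[-1] is du(t-1), du_past[-2] is du(t-2), ...
    du_future[j] is du(t+j) for j >= 0; missing entries count as zero.
    """
    du_future = list(du_future) if du_future else []

    def du(offset):
        # Control increment at time t + offset.
        if offset >= 0:
            return du_future[offset] if offset < len(du_future) else 0.0
        idx = len(du_past) + offset
        return du_past[idx] if idx >= 0 else 0.0

    preds, y = [], y_now
    for k in range(1, horizon + 1):
        y += sum(g[i - 1] * du(k - i) for i in range(1, len(g) + 1))
        preds.append(y)
    return preds


if __name__ == "__main__":
    print(free_response([0.5, 0.25], 1.0, [1.0], 3))
```

In a multivariable setting the same recursion would be run with matrix-valued G(i) and vector-valued increments, and the resulting plant-frame predictions would then be projected onto the eigenvector frame.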

Task  2:  Incremental  Subsystem  Control  Identification 

Using  specified  plant-frame  reference  signals,  corresponding  subsystem 
reference  vectors,  c,  can  be  obtained  by  transforming  the  plant-frame 
signals  as  shown  in  eqn  8.18.  The  appropriate  incremental  control  can  then 
be calculated using c, r, and the control law described by eqn 8.23.

Task  3:  Control  Transformation  and  Application 

The  incremental  subsystem  controls  generated  by  Task  2  at  time  t, 
$\{ \Delta \tilde{u}_i(t); i = 1, \cdots, m \}$, can be transformed into the plant frame using the
vector relationship:

$\Delta u(t) = W\, \Delta \tilde{u}(t)$ ....(8.25)

where $\Delta u(t) = [\, \Delta u_1(t) \; \cdots \; \Delta u_m(t) \,]^t$ and $\Delta \tilde{u}(t) = [\, \Delta \tilde{u}_1(t) \; \cdots \; \Delta \tilde{u}_m(t) \,]^t$ denote
the vectors of incremental controls in the plant and eigenvector frames
respectively at time t. The control vector to be applied to the plant is,
therefore, given by:

$u(t) = u(t-1) + \Delta u(t)$ ....(8.26)

As in the scalar case, the applied control includes integral action. It
should also be noted that, for this implementation, the integral action is
brought specifically to a point past W to keep the length of the truncated
sequences to a minimum.
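
The transformation and integration steps of Task 3 (eqns 8.25 and 8.26) can be sketched in Python as follows. The sketch assumes that the operator W is available as a truncated list of m x m matrices W(0), W(1), ..., and that a short history of eigenframe increments is stored with the most recent value first; both conventions, and the function names, are assumptions made for this example.

```python
def apply_operator(w_seq, dut_hist):
    """Return sum_i W(i) * du_tilde(t - i).

    w_seq[i] is the m x m matrix W(i); dut_hist[0] is du_tilde(t),
    dut_hist[1] is du_tilde(t-1), and so on. History that is not
    supplied is treated as zero.
    """
    m = len(w_seq[0])
    out = [0.0] * m
    for w_i, v in zip(w_seq, dut_hist):
        for r in range(m):
            out[r] += sum(w_i[r][c] * v[c] for c in range(m))
    return out


def plant_control(u_prev, w_seq, dut_hist):
    """u(t) = u(t-1) + W du_tilde(t): integrate after the transformation."""
    du = apply_operator(w_seq, dut_hist)
    return [up + d for up, d in zip(u_prev, du)]


if __name__ == "__main__":
    w = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [0.0, 0.0]]]
    hist = [[1.0, 2.0], [3.0, 4.0]]
    print(plant_control([10.0, 20.0], w, hist))
```

Placing the integration after the multiplication by W mirrors the remark above that the integral action is brought to a point past W.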

The  fundamental  characteristic  of  the  multivariable  predictive  control 
algorithm  proposed  above  is  that  it  provides  independent  control  over  each 
scalar  characteristic  subsystem  of  the  plant.  Indeed,  the  incremental 
control  law  for  each  subsystem  (eqn  8.23)  is  identical  to  the  scalar  control 
law  associated  with  SISO  GPC.  As  a  result,  the  user  has  precisely  the  same 
flexibility  in  the  selection  of  control  parameters  for  each  subsystem  as 
that  available  in  the  standard  SISO  GPC  control  algorithm.  More 
specifically, minimum and maximum prediction horizons ($N_1$, $N_2$) and control
horizons ($N_u$) can be selected independently for each subsystem using
considerations similar to those presented in [CLA3] for the choice of these
parameters. In addition, the refinements to SISO GPC identified in [CLA4]
can  also  be  applied  independently  to  each  subsystem.  An  important 
implication  of  the  characteristic  decomposition  results  presented  in  this 
chapter  is,  therefore,  that  the  user  can  now  implement  direct  control  over 
each  individual  subsystem.  In  effect,  the  proposed  algorithm  establishes 
appropriate  controls  for  each  of  the  m  loops  of  the  multivariable  system 
directly  without  regard  to  "correct"  input/output  pairings. 

An  assumption  implicit  throughout  the  development  above  is  that 
effective  loop-by-loop  control  of  multivariable  systems  can  be  achieved  via 
control  of  the  characteristic  subsystems.  The  validity  of  this  assumption 
can  be  substantiated  by  returning  to  the  characteristic  locus  framework  from 
which  the  proposed  algorithm  was  derived.  Two  points,  in  particular,  must 
be highlighted. First, the analyticity of the $g_i(z)$ implies a one-to-one
correspondence between the closed-loop poles of the system and the proximity
of the characteristic loci to the critical point (-1,0) in the complex
plane. As a result, good subsystem performance (as defined by $g_i$) implies
and is implied by good loop performance.

Second, the difference between individual loop performance and
corresponding subsystem performance may be quantified in terms of bounds
that depend on the frequency-dependent eigenfunctions of the open-loop
system and the angular misalignment between the frequency-dependent
eigenvectors and the standard basis vectors [MAC1], [KOU8]. Furthermore, it
can be shown that these bounds become negligible (and hence, the deviations
between loop and subsystem performance become insignificant) as the
closed-loop eigenfunctions become equal and/or the misalignment angles become small
over the entire range of frequencies from 0 to $\pi/T$. At low frequencies,
integral action ensures high open-loop gains in all loops, so that existing
differences between the closed-loop eigenfunctions are necessarily small.
Hence, the relative deviation between loop and subsystem performance for low
frequency inputs is guaranteed to be small.

At  high  frequencies  however,  it  is  not  generally  possible  to  generate 
high  gains  because  of  stability  considerations.  At  these  frequencies 
therefore,  alternative  methods  must  be  used  to  maintain  small  deviations 
between  loop  and  subsystem  performance.  A  useful  solution  to  this  problem 
is  the  introduction  of  precompensation  which  aligns  the  high-frequency 
eigenvectors  of  the  closed-loop  system  to  the  standard  basis  vectors.  Since 
the  eigenvectors  of  the  open-  and  closed-loop  systems  are  the  same,  this 
task  can  be  achieved  using  a  constant  real  precompensator  which  accurately 
approximates the column structure of $G^{-1}\{ e^{j\omega T} \}$ at appropriately high
frequencies. Indeed, for discrete systems, a compensator of the form:

$K_h = G^{-1}\{ e^{j\pi} \} = G^{-1}(-1)$

is a simple candidate since G(-1) will, in general, be invertible. In
instances when G(-1) is rank deficient, application of the ALIGN algorithm
[KOU1] at a slightly lower frequency will produce a suitable precompensator.
Using this compensation, the angular misalignment between the high frequency
eigenvectors and the standard basis vectors can be reduced significantly
and, hence, the relative deviation between loop and subsystem performance
during fast transients will also be small. Furthermore, unless the system
is  characterized  by  rapidly  changing  eigenvectors  (as  might  occur  in  systems 
where  resonant  phenomena  are  associated  with  individual  inputs  and/or 
outputs),  the  high  and  low  frequency  behaviour  described  above  will  carry 
over  to  intermediate  frequencies  ensuring  small  deviations  over  this  range 
of  frequencies  as  well.  Thus  for  most  practical  problems,  the  deviation  of 
loop  performance  from  subsystem  performance  is  guaranteed  to  be  small  for 
all  inputs.  This  result,  combined  with  the  analyticity  argument  above, 
suggests  that  adequate  loop-by-loop  performance  can  indeed  be  successfully 
achieved  via  independent  control  of  the  individual  characteristic 
subsystems. 
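
The constant high-frequency precompensator $K_h = G^{-1}(-1)$ discussed above can be computed directly from a finite weighting sequence. The Python sketch below assumes a two-input, two-output plant stored as a list of 2 x 2 matrices with g_seq[0] holding G(1); the function names and the singularity tolerance are hypothetical. A return value of None signals the rank-deficient case, for which the text recommends evaluating at a slightly lower frequency instead.

```python
def g_at_minus_one(g_seq):
    """Evaluate G(z) = sum_i G(i) z^-i at z = -1; g_seq[i-1] holds G(i)."""
    m = len(g_seq[0])
    total = [[0.0] * m for _ in range(m)]
    for i, g_i in enumerate(g_seq, start=1):
        sign = -1.0 if i % 2 else 1.0  # (-1)^-i equals (-1)^i
        for r in range(m):
            for c in range(m):
                total[r][c] += sign * g_i[r][c]
    return total


def inverse_2x2(a, tol=1e-12):
    """Inverse of a 2 x 2 matrix, or None when it is numerically singular."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if abs(det) < tol:
        return None
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


if __name__ == "__main__":
    g_seq = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 1.0], [0.0, 0.0]]]
    print(inverse_2x2(g_at_minus_one(g_seq)))
```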

Yet  another  argument  that  substantiates  the  close  correlation  between 
loop  and  subsystem  performance  is  the  following:  uniform  control  design 
objectives  (described  by  eqn  8.19)  for  the  individual  characteristic 
subsystems  will  produce  subsystem  controls  (eqn  8.23)  which  tend  to  balance 
the  eigenfunctions  of  the  open-loop  system  and,  hence,  to  match  the  closed- 
loop  eigenfunctions.  But  matching  the  closed-loop  eigenfunctions,  as 
explained  above,  reduces  the  bound  on  the  relative  deviation  between  loop 
and  subsystem  performance.  In  many  instances  then,  the  desired  correlation 
between loop and subsystem performance may be achieved simply by implementing
the predictive control law of Section 8.1.2 without resorting to high
frequency  alignment  compensation.  Indeed,  the  simulation  results  presented 
in  Section  8.6  demonstrate  that  low  interaction  can  be  achieved  using  the 
proposed  algorithm  without  high  frequency  alignment  compensation. 

8.2  Indirect  Identification  of  the  CWS  and  Eigenvector  Weighting  Sequences

As suggested above, the predictive control law defined by eqn 8.23 can
be implemented in a nonadaptive manner using prior knowledge of the system
transfer function matrix and, hence, the CWS, the eigenvector weighting
sequences, and the dual eigenvector weighting sequences. However, when
on-line system identification is required for adaptive applications,
appropriate identification techniques must now include the ability to
generate the required CWS and eigenvector weighting sequence information.
As shown in this section and the next, there are two ways to accomplish
this task.

The  first  alternative  is  to  use  existing  recursive  estimators  to 

identify  the  open-loop  system  model  and,  then,  to  use  the  standard  CSM 

algorithm  described  in  Chapter  7  to  generate  the  necessary  elements  of  the 

CVS  and  CVS.  It  is  important  to  note  that  the  implementation  of  the 

standard  CSM  algorithm  relies  on  the  convolution  relationship 

G  *  w.  =  g.  *  w. 

1  6i  1 

Hence,  a  weighting  sequence  description  of  the  open-loop  plant  must  be 
available  when  using  this  algorithm.  The  elements  of  the  desired  weighting 
sequences  can,  in  fact,  be  established  in  one  of  two  ways. 

First, G(z) may be parameterized as a finite weighting sequence of the
form:

$G(z) = \sum_{i=1}^{N_G} G(i)\, z^{-i}$

and  standard  recursive  algorithms  can  be  used  to  identify  the  elements 
directly.  Alternatively,  G(z)  may  be  formulated  in  difference  equation  form 
as: 

$G(z) = A(z)^{-1} B(z)$

and,  again,  standard  recursive  estimators  can  be  used  to  identify  the 
polynomial  matrices  A  and  B.  When  the  difference  equation  formulation  is 
used  with  the  standard  CSM  algorithm,  the  required  weighting  sequence 
elements  can  be  generated  by  simulating  the  system  response  for  the  model 
A  y(t)  =  B  u(t)  starting  from  zero  initial  conditions  and  applying  an 
impulse  to  each  of  the  inputs  in  turn.  However,  it  should  also  be  noted 
that,  for  difference  equation  models,  a  potentially  more  efficient  algorithm 
for computing the CWS and CVS directly from A and B can be developed via an
extension  of  the  standard  CSM  algorithm.  The  fundamentals  of  this  extension 
are  highlighted  in  Appendix  8.1. 
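
The impulse-response simulation described above can be sketched in a few lines. This is a minimal illustration, not the thesis implementation: the function name `weighting_sequence` and the coefficient-list layout are assumptions, with A(z) taken as I + A1 z^-1 + ... and B(z) as B1 z^-1 + ... so that the weighting sequence starts at G(1).

```python
import numpy as np

def weighting_sequence(A_coeffs, B_coeffs, n_terms):
    """Return [G(1), ..., G(n_terms)] for the model A(z) y(t) = B(z) u(t).

    A_coeffs = [A1, ..., Ana] with A(z) = I + A1 z^-1 + ... + Ana z^-na
    B_coeffs = [B1, ..., Bnb] with B(z) = B1 z^-1 + ... + Bnb z^-nb
    Column c of G(k) is the output at time k for a unit impulse on input c
    at time 0, starting from zero initial conditions.
    """
    zero = np.zeros_like(np.asarray(B_coeffs[0], dtype=float))
    G = []
    for k in range(1, n_terms + 1):
        # Forcing term: the impulse reaches the output through B_k (zero beyond nb).
        Gk = np.array(B_coeffs[k - 1], dtype=float) if k <= len(B_coeffs) else zero.copy()
        # Autoregressive part: subtract A_j y(k - j) for all earlier outputs.
        for j, Aj in enumerate(A_coeffs, start=1):
            if k - j >= 1:
                Gk -= np.asarray(Aj, dtype=float) @ G[k - j - 1]
        G.append(Gk)
    return G
```

For example, with A(z) = I - 0.5 z^-1 I and a single coefficient B1, the sequence decays geometrically: G(k) = 0.5^(k-1) B1.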

The  selection  of  an  appropriate  multivariable  model  to  accomplish  the 
tasks described above depends on a number of factors. Among the most
important  is  the  number  of  parameters  to  be  estimated  as  this  requirement 
directly  affects  the  computation  time  associated  with  the  identification 
algorithm.  For  multivariable  systems,  the  number  of  model  parameters 
required  is  a  function  of  the  number  of  poles  associated  with  each 
individual  element  of  G(z),  the  location  of  these  poles,  and  the  dimension 
of  the  system.  So  unlike  the  standard  SISO  problem,  the  lengths  of  A  and 
B may, in certain cases, be comparable to the length of ${}^{s}G$. Furthermore,
the  overall  computational  complexity  of  the  self-tuning  control  algorithm 
also  depends  on  the  computation  time  associated  with  the  particular  control 
implementation  selected,  a  quantity  which  can  vary  considerably  depending  on 
the  model  selected.  For  these  reasons,  the  problem  of  multivariable  model 
selection  to  achieve  efficient  computational  implementations  is  not  as  well- 
defined  as  in  the  scalar  case  and,  indeed  in  some  situations,  the  selection 
of  a  weighting  sequence  model  may  produce  computational  advantages.  Further 
computational  considerations  are  examined  in  more  detail  in  Section  8.5. 

8.3  Direct  Identification  of  the  CWS  and  CVS

The intermediate task of identifying G(z) prior to generating the CWS
and CVS can be eliminated provided a method exists to estimate the CWS and
CVS directly from the available input/output information. Returning to eqn
8.3 and assuming a noise-free environment, the relationship between $y_i$ and
$u_i$ is:

$v_i^t\, y(t) = g_i\, v_i^t\, u(t)$  ....(8.27)

where $v_i^t$ denotes the $i$th row of V. In convolutional terms, eqn 8.27 can be
written as:

${}^{s}v_i^t * {}^{s}y = {}^{s}g_i * {}^{s}v_i^t * {}^{s}u$  ....(8.28)

and this relationship can be expanded into matrix form (with a slight abuse
of notation) to produce:

$V_i\, y = G_i\, V_i\, u$  ....(8.29)

where

$y = [\, y^t(1) \;\cdots\; y^t(N) \,]^t, \qquad u = [\, u^t(0) \;\cdots\; u^t(N-1) \,]^t,$

$V_i = \begin{bmatrix} v_i^t(0) & & \\ v_i^t(1) & v_i^t(0) & \\ \vdots & \ddots & \ddots \end{bmatrix}, \qquad G_i = \begin{bmatrix} g_i(1) & & \\ g_i(2) & g_i(1) & \\ \vdots & \ddots & \ddots \end{bmatrix}$


and N is the number of input/output measurements available. Under the
assumption that ${}^{s}v_i^t$ and ${}^{s}g_i$ can be accurately truncated after $N_v^i$ and $N_g^i$
terms respectively (i.e. ${}^{s}v_i^t = \{v_i^t(0), \cdots, v_i^t(N_v^i-1)\}$ and
${}^{s}g_i = \{g_i(1), \cdots, g_i(N_g^i)\}$), eqn 8.29 can be reformulated as

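$Y x = U M x$  ....(8.30)

where x is an $[mN_v^i \times 1]$ vector and Y, U and M are matrices of dimension
$[N \times mN_v^i]$, $[N \times m(N_g^i+N_v^i-1)]$ and $[m(N_g^i+N_v^i-1) \times mN_v^i]$ respectively, defined by:

$x = [\, v_i^t(0) \;\cdots\; v_i^t(N_v^i-1) \,]^t$

$Y = \begin{bmatrix} y^t(1) & & 0 \\ \vdots & \ddots & \\ y^t(N) & \cdots & y^t(N-N_v^i+1) \end{bmatrix} \qquad U = \begin{bmatrix} u^t(0) & & 0 \\ \vdots & \ddots & \\ u^t(N-1) & \cdots & u^t(N-N_g^i-N_v^i+1) \end{bmatrix}$

$M = \begin{bmatrix} g_i(1)\, I_m & & 0 \\ \vdots & \ddots & \\ g_i(N_g^i)\, I_m & & g_i(1)\, I_m \\ & \ddots & \vdots \\ 0 & & g_i(N_g^i)\, I_m \end{bmatrix}$
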

From eqn 8.30, each CWS/dual CVS pair must satisfy the relationship
(Y - UM) x = 0. Hence, direct identification of these quantities may be
posed as a constrained minimization problem: select x and the corresponding
M so that $\|(Y - UM)\,x\|$ is minimized subject to the constraint that the
vector x is normalized (i.e. $x^t x = 1$). This formulation yields a cost
function of the form:

$J = \|(Y - UM)\,x\|^2 + \lambda\,(1 - x^t x)$  ....(8.31)

where $\lambda$ is a Lagrange multiplier introduced to include the constraint, and
the solutions to this problem are given by the parameter sets, $\{{}^{s}g_i, {}^{s}v_i^t\}$,
which simultaneously satisfy the conditions:

$\partial J/\partial x = 0 \qquad \partial J/\partial g_i(j) = 0\,;\; j = 1, \cdots, N_g^i$  ....(8.32a,b)

Closer examination of eqn 8.32a yields the relationship

$(Y - UM)^T (Y - UM)\, x - \lambda\, x = 0$  ....(8.33)

This result not only establishes a necessary condition on x for J to be a
minimum, but it also suggests a procedure for computing x given M. In
particular, eqn 8.33 can be rewritten as:

$(Y - UM)^T (Y - UM)\, x = \lambda\, x$  ....(8.34)

But this result defines an eigenvalue/eigenvector relationship between x
and $\lambda$ (or J). Hence, the x which minimizes J is simply the eigenvector
associated with the smallest eigenvalue of $(Y - UM)^T (Y - UM)$.

Additional conditions for the solution of this minimization problem are
imposed by eqn 8.32b which yields a set of linear equations in the elements
of ${}^{s}g_i$. In particular,

$\partial J/\partial g_i(j) = -2\, \{\, x^t\, [\partial M/\partial g_i(j)]^T\, U^T (Y - UM)\, x \,\} = 0$  ....(8.35)

Defining the $[N_g^i \times 1]$ vector g and the $[m(N_g^i+N_v^i-1) \times N_g^i]$ matrix X as:

$g = [\, g_i(1) \;\cdots\; g_i(N_g^i) \,]^t$

with the $j$th column of X containing x preceded by $(j-1)m$ zeros (and followed
by zeros to fill the column), the following relationships may also be established:

$M x = X g$  ....(8.36a)

$\begin{bmatrix} x^t\, [\partial M/\partial g_i(1)]^T \\ \vdots \\ x^t\, [\partial M/\partial g_i(N_g^i)]^T \end{bmatrix} = X^T$  ....(8.36b)

Upon substitution of these expressions into the set of equations 8.35, the
following result is obtained:

$g = \{X^T U^T U X\}^{-1}\, X^T U^T Y x$  ....(8.37)

Hence, the elements of ${}^{s}g_i$ are simply functions of the known system inputs
and outputs and the vector x.
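
For reference, the substitution that leads from eqn 8.35 to eqn 8.37 can be written out explicitly. The steps below use only the identities 8.36a and 8.36b and assume that $X^T U^T U X$ is invertible (i.e. the filtered input data are sufficiently exciting); that invertibility condition is an assumption added here, not a statement from the text.

```latex
% Stack the N_g^i scalar conditions of eqn 8.35 and apply eqn 8.36b:
X^T U^T (Y - UM)\, x = 0
% Replace M x by X g using eqn 8.36a:
X^T U^T Y x - X^T U^T U X\, g = 0
% Solve for g (assuming X^T U^T U X is nonsingular):
g = \{X^T U^T U X\}^{-1} X^T U^T Y x
```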

Eqn 8.37 can be used in conjunction with eqn 8.34 to produce the
following iterative algorithm for the direct identification of
$\{{}^{s}g_i, {}^{s}v_i^t;\; i = 1, \cdots, m\}$.

Algorithm 8.1: Direct Identification of $\{{}^{s}g_i, {}^{s}v_i^t\}$

For each i (i = 1, ..., m),

(1) Initialize x to $x_0$.

(2) Solve for $g_0$ using $x_0$ and eqn 8.37.

(3) Identify $x_1$ as the eigenvector associated with the smallest
eigenvalue of $(Y - UM_0)^T (Y - UM_0)$, where $M_0$ is defined (as shown
previously) using the elements of $g_0$.

(4) Repeat steps 2 and 3 until the solution converges.
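
The four steps of Algorithm 8.1 can be sketched numerically as follows. This is a hedged illustration rather than the thesis code: all function names (`lag_matrix`, `build_M`, `build_X`, `solve_g`, `min_eigpair`, `algorithm_8_1`, `demo_data`) are hypothetical, the test plant is an assumed noise-free 2x2 system with constant eigenvectors (so $N_v^i = 1$), and a full symmetric eigendecomposition (`numpy.linalg.eigh`) stands in for whatever eigenvector routine is preferred in practice.

```python
import numpy as np

def lag_matrix(sig, delay, n_lags, N):
    """Row k (k = 1..N) is [sig(k-delay)^t, sig(k-delay-1)^t, ...]; sig(t) = 0 for t < 0."""
    m = sig.shape[1]
    out = np.zeros((N, m * n_lags))
    for k in range(1, N + 1):
        for p in range(n_lags):
            t = k - delay - p
            if 0 <= t < sig.shape[0]:
                out[k - 1, p * m:(p + 1) * m] = sig[t]
    return out

def build_M(g, m, Nv):
    """Block-banded M of eqn 8.30: block (j + l, j) equals g(l + 1) * I_m."""
    Ng = len(g)
    M = np.zeros((m * (Ng + Nv - 1), m * Nv))
    for j in range(Nv):
        for l in range(Ng):
            M[(j + l) * m:(j + l + 1) * m, j * m:(j + 1) * m] = g[l] * np.eye(m)
    return M

def build_X(x, m, Ng):
    """Matrix X with M x = X g (eqn 8.36a): column l is x shifted down by l*m rows."""
    Nv = len(x) // m
    X = np.zeros((m * (Ng + Nv - 1), Ng))
    for l in range(Ng):
        X[l * m:(l + Nv) * m, l] = x
    return X

def solve_g(Y, U, x, m, Ng):
    """Least-squares step (eqn 8.37), solved with lstsq instead of an explicit inverse."""
    g, *_ = np.linalg.lstsq(U @ build_X(x, m, Ng), Y @ x, rcond=None)
    return g

def min_eigpair(Y, U, g, m, Nv):
    """Eigenvector step (eqn 8.34): smallest eigenpair of (Y - UM)^T (Y - UM)."""
    R = Y - U @ build_M(g, m, Nv)
    vals, vecs = np.linalg.eigh(R.T @ R)   # eigenvalues are returned in ascending order
    return vals[0], vecs[:, 0]

def algorithm_8_1(Y, U, x0, m, Ng, n_iter=20):
    """Alternate steps 2 and 3; returns g, x and the cost after every eigenvector step."""
    Nv = len(x0) // m
    x = x0 / np.linalg.norm(x0)
    costs = []
    for _ in range(n_iter):
        g = solve_g(Y, U, x, m, Ng)
        cost, x = min_eigpair(Y, U, g, m, Nv)
        costs.append(cost)
    return g, x, costs

def demo_data(N=80, seed=0):
    """Hypothetical noise-free 2x2 plant G(z) = W diag(g1, g2) V with constant V."""
    rng = np.random.default_rng(seed)
    W = np.array([[2.0, 1.0], [1.0, 1.0]])
    V = np.linalg.inv(W)
    g_true = np.array([[0.5, 0.3, 0.1], [0.8, -0.2, 0.05]])
    u = rng.standard_normal((N, 2))        # u(0), ..., u(N-1)
    y = np.zeros((N + 1, 2))               # y(0), ..., y(N)
    for t in range(1, N + 1):
        for l in range(1, min(t, 3) + 1):
            y[t] += W @ np.diag(g_true[:, l - 1]) @ V @ u[t - l]
    return u, y, V, g_true
```

With `u, y, V, g_true = demo_data()`, the data matrices follow as `Y = lag_matrix(y, 0, Nv, N)` and `U = lag_matrix(u, 1, Ng + Nv - 1, N)`. Because step 2 minimizes the cost over g for fixed x and step 3 minimizes it over unit-norm x for fixed g, the recorded cost cannot increase from one pass to the next; whether it reaches zero from a given seed is the convergence question taken up in Section 8.3.1.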

Algorithm 8.1 can be used to identify the sequences ${}^{s}\Lambda$ and ${}^{s}V$ required to
implement the control law defined in Section 8.1.2. Of course, ${}^{s}W$ must also
be generated to perform the necessary transformations back to the original
plant frame. With ${}^{s}V$ known, this can be achieved using a "deconvolution"
algorithm based on eqn 7.2.

There are two important characteristics of this algorithm that should
also be highlighted here. First, the solution for g defined by eqn 8.37
takes the form of a standard least-squares solution. Second, the
eigenvector update in step 3 of the algorithm relies only on the minimum
eigenvalue of $(Y - UM)^T (Y - UM)$ and its associated eigenvector. From a
computational point of view, this relationship implies that an iterative
algorithm such as the 'power method' may yield a more efficient solution for
x than standard eigenvalue/eigenvector decomposition algorithms. Together,
these characteristics suggest that it may be possible to implement a
recursive version of the algorithm. This development will be discussed in
more detail in Section 8.3.2.

8.3.1  Convergence  of  the  Algorithm 

To  ensure  that  Algorithm  8.1  identifies  the  correct  solutions,  its 
convergence  characteristics  must  be  investigated  in  detail.  One  way  to 
accomplish this task is to demonstrate that, for an initial value of x ($\hat{x}_o$)
sufficiently close to the true x ($x_o$), the algorithm converges to the
desired solution. As shown in Appendix 8.2, a perturbation analysis of the
algorithm in the vicinity of the solutions $\{{}^{s}g_i, {}^{s}v_i^t\}$ which set J to zero
clearly establishes this characteristic. As noted in the appendix, the
algorithm may jump from one solution to another under certain circumstances.
However, these special conditions correspond to pathological situations when
two  or  more  eigenvectors,  v(z),  are  nearly  parallel  for  all  z  and  when  the 
corresponding  eigenfunctions,  g(z),  are  nearly  equal  for  all  z.  Since  the 
likelihood  of  this  situation  arising  in  practice  is  negligible,  initializing 
the  algorithm  with  accurate  seeds  will  ensure  that  the  proper  solutions  are 
obtained.  The  requirement  for  an  accurate  initial  seed  can,  in  fact,  be 
satisfied  using  the  standard  CSM  algorithm  in  conjunction  with  estimates  of 
the  plant  weighting  sequence  obtained  from  a  standard  identification 
algorithm.  From  this  perspective,  implementation  of  an  appropriate 
identification  algorithm  would  require  a  run  of  the  standard  identification 
algorithm  (discussed  in  Section  8.2)  for  just  as  long  as  necessary  to 
generate  the  seed  and  a  subsequent  switch  to  Algorithm  8.1. 

Further investigations of the general convergence characteristics of
the algorithm, however, suggest that the requirement for an accurate initial
seed may be unnecessary. To verify this observation, consider first the
problem of identifying the eigenvalues and left eigenvectors of an [m x m]
matrix A using an iterative algorithm similar to the one proposed above.
The cost function for this problem becomes:

$J_A = \|\, v^t (A - \mu I) \,\|^2 + \lambda\, (1 - v^t v)$  ....(8.38)

where $\mu$ is the desired eigenvalue and $v^t$ is the corresponding left
eigenvector. The stationarity conditions can then be written as:

$(A - \mu I)(A - \mu I)^T\, v = \lambda\, v$  ....(8.39a)

$\mu = \{ v^t (A^T + A)\, v \} \,/\, 2\, v^t v$  ....(8.39b)

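Upon substituting the above expression for $\mu$ into eqn 8.38, $J_A$ becomes
solely a function of v, and it is possible to establish the Hessian of $J_A$
($\partial^2 J_A/\partial v^2$) as:

$\frac{1}{2} \frac{\partial^2 J_A}{\partial v^2} = (A - \mu I)(A - \mu I)^T - \eta\, \eta^t - \lambda I$  ....(8.40)

where $\eta = \frac{1}{\sqrt{v^t v}}\, \{(A - \mu I) + (A - \mu I)^T\}\, v$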

The characteristics of $J_A$ at its stationary points can now be
investigated in more detail. Let $\{\mu_o, v_o^t\}$ define a stationary point of $J_A$.
Then under the normalizing constraint $v_o^t v_o = 1$, eqns 8.39a,b imply:

$v_o^t (A - \mu_o I)(A - \mu_o I)^T v_o = \lambda$  and  $v_o^t A\, v_o = \mu_o$

which, together with eqn 8.40, give

$\frac{1}{2}\, v_o^t \frac{\partial^2 J_A}{\partial v^2} v_o = v_o^t (A - \mu_o I)(A - \mu_o I)^T v_o - \lambda - \{2\,(v_o^t A\, v_o - \mu_o)\}^2 = 0$

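Hence, $\partial^2 J_A/\partial v^2$ must be rank deficient at every stationary point of $J_A$. More
information can, however, be established by recognizing that $(A - \mu I)$ can be
written in terms of its singular value decomposition as:

$A - \mu I = R_1\, \Sigma\, R_2^T$  ....(8.41)

where $\Sigma = \mathrm{diag}\{\underline{\sigma}, \sigma_2, \cdots, \sigma_{m-1}, \bar{\sigma}\}$. Substituting eqn 8.41 into eqn
8.40 (with $v^t v = 1$) yields:

$\frac{1}{2} \frac{\partial^2 J_A}{\partial v^2} = R_1 \Sigma^2 R_1^T - \lambda I - \{R_1 \Sigma R_2^T + R_2 \Sigma R_1^T\}\; v\, v^t\, \{R_1 \Sigma R_2^T + R_2 \Sigma R_1^T\}$  ....(8.42)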
Special characteristics of the stationary points can now be
reintroduced. In particular, eqn 8.39a must be satisfied at the stationary
point. But since the stationary point was reached by selecting the v which
corresponds to the minimum $\lambda$, $\lambda$ must be exactly equal to $\underline{\sigma}^2$ and $v_o^t$ must
identify the first row of $R_1^T$. Using these relationships, the following
result can be derived from eqn 8.42:

$\frac{1}{2}\, v_o^t \frac{\partial^2 J_A}{\partial v^2} v_o = (\underline{\sigma}^2 - \lambda) - 4\, \underline{\sigma}^2 (v_o^t w_o)^2 = -\,4\, \underline{\sigma}^2 (v_o^t w_o)^2$  ....(8.43)

where $w_o$ identifies the first column of $R_2$ at the specified stationary
point. As shown previously, $v_o^t \frac{\partial^2 J_A}{\partial v^2} v_o = 0$. But from eqn 8.43, it is now
evident that this can be achieved in two different ways: either $\underline{\sigma}^2 = 0$ or
$v_o^t w_o = 0$. Thus, the only stationary points of $J_A$ which do not correspond
to the correct solutions of the problem are those for which the minor input
and output principal directions of $(A - \mu_o I)$ are orthogonal (i.e. $v_o^t w_o = 0$)
and $\underline{\sigma} \neq 0$. In these special cases, eqn 8.43 clearly demonstrates that, for
small perturbations ($v = v_o + \delta v$) away from $v_o$, $v^t \frac{\partial^2 J_A}{\partial v^2} v$ will always be
negative. In essence then, each of these special stationary points is a
local maximum and the algorithm will only stop at one of these points if the
initial value for v is exactly equal to $v_o$. Thus, the only stable solutions
identified by the algorithm will correspond to the true eigenvalues and
eigenvectors of A.
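
The simplified eigenvalue problem can be explored numerically with a short sketch. The function name `left_eig_iteration` is hypothetical, and the example matrices are arbitrary choices made here for illustration. Each pass applies eqn 8.39b and then eqn 8.39a, and the smallest eigenvalue is recorded so that the behaviour of $J_A$ can be inspected; the sketch makes no claim about how quickly the iteration converges for a general matrix.

```python
import numpy as np

def left_eig_iteration(A, v0, n_iter=50):
    """Alternate eqn 8.39b (update mu) and eqn 8.39a (update v) for a real matrix A.

    Returns the final mu, the final unit vector v, and the history of the
    smallest eigenvalue of (A - mu I)(A - mu I)^T, which equals J_A after
    each eigenvector update.
    """
    n = A.shape[0]
    v = v0 / np.linalg.norm(v0)
    history = []
    for _ in range(n_iter):
        mu = v @ A @ v                       # eqn 8.39b with v^t v = 1
        B = A - mu * np.eye(n)
        vals, vecs = np.linalg.eigh(B @ B.T) # ascending eigenvalues
        v = vecs[:, 0]                       # eigenvector of the smallest lambda
        history.append(vals[0])
    return v @ A @ v, v, history
```

For a symmetric matrix such as `[[2, 1], [1, 3]]`, the eigenvectors of $(A - \mu I)(A - \mu I)^T$ coincide with those of A, so the iteration lands on an exact eigenpair after the first pass (provided the shift does not sit midway between two eigenvalues). For a nonsymmetric matrix the recorded values are non-increasing, since each half-step minimizes $J_A$ over one variable with the other held fixed.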

A rigorous investigation of the general convergence characteristics of
the original problem requires the identification and analysis of the Hessian
of J (where J is defined by eqn 8.31) in a manner analogous to that shown
above. Although it is possible to generate an appropriate expression for
this quantity, the result obtained does not exist in a form that can be
readily investigated to identify the characteristics of J at its stationary
points. It is, however, possible to extend the discussion and results above
to this original problem by recognizing that the solutions $\{{}^{s}g_i, {}^{s}v_i^t\}$ for
which J = 0 also define the functions $g_i(z)$ and $v_i^t(z)$ which satisfy the
relationship $v_i^t \{G - g_i I\} = 0$. The format of the original problem is,
therefore, identical to that examined above, and the conclusions reached
should extend directly. This observation suggests the following result:

Conjecture: The only stable solutions $\{{}^{s}g_i, {}^{s}v_i^t\}$ generated by Algorithm
8.1 will be those for which J (defined by eqn 8.31) is identically zero
(i.e. the correct solutions).

This result implies that Algorithm 8.1 may be used independently to identify
the m solutions $\{{}^{s}g_i, {}^{s}v_i^t\}$ required to implement the control law of Section
8.1.2.

8.3.2  A  Recursive  Implementation  of  the  Algorithm 

Although  Algorithm  8.1  will,  in  general,  identify  the  desired 
solutions,  the  iterative  nature  of  the  algorithm  makes  it  difficult  to 
implement  when  on-line  identification  is  required.  It  is  interesting  to 
note, however, that the solution for $g_i$ defined by eqn 8.37 is also the
solution to a scalar model identification problem of the form:

$y_i(k) = \sum_{j=1}^{N_g^i} u_i(k-j)\, g_i(j)$

where $y_i$ and $u_i$ are the appropriate elements of y and u (defined in eqn 8.4)
and, hence, represent the inputs and outputs of the system filtered by the
appropriate function $v_i^t(z)$. Viewing the problem from this perspective, it
seems reasonable to anticipate that a two-stage recursive algorithm can be
used to generate the desired parameter estimates. One such algorithm is
proposed below:

Algorithm 8.2: Recursive Algorithm for the Identification of $\{{}^{s}g_i, {}^{s}v_i^t\}$

(1) At time t, let

$y_i(t) = v_i^t\, y(t) \qquad u_i(t-1) = v_i^t\, u(t-1)$

$d_i^t(t) = [\, u_i(t-1) \;\; u_i(t-2) \;\cdots\; u_i(t-N_g^i) \,]$

Then,

$g_i(t) = g_i(t-1) + K(t)\, \{\, y_i(t) - d_i^t(t)\, g_i(t-1) \,\}$

$K(t) = P(t-1)\, d_i(t) \,/\, \{\, 1 + d_i^t(t)\, P(t-1)\, d_i(t) \,\}$

$P(t) = \{\, I - K(t)\, d_i^t(t) \,\}\, P(t-1)$

(2) Using the elements of $g_i(t)$, $\{g_i(k),\; k = 1, \cdots, N_g^i\}$, define

$\alpha^t(j) = y^t(t-j) - \sum_{k=1}^{N_g^i} g_i(k)\, u^t(t-j-k)\,;\quad j = 0, \cdots, N_v^i-1$

$\gamma^t = [\, \alpha^t(0) \;\cdots\; \alpha^t(N_v^i-1) \,]$

Then,

$\Gamma(t) = \Gamma(t-1) + \gamma\, \gamma^t$

$z(t) = \{\, \Gamma(t) - kI \,\}\, x(t-1)$

(where k is chosen so that $|\underline{\sigma} - k| > |\sigma_i - k|$ for all i)

$x(t) = [\, v_i^t(0) \;\cdots\; v_i^t(N_v^i-1) \,]^t = z(t) \,/\, \sqrt{z^t(t)\, z(t)}$


Part 1 of Algorithm 8.2 is simply a recursive least-squares algorithm
applied to appropriately filtered input/output data. Part 2 uses the
current estimates of $g_i(t)$ to update $(Y - UM)^T (Y - UM)$ and implements an
"inverse power method" algorithm (assuming that only small changes in x are
introduced by the additional data) to update x. It is interesting to note
that the need to specify the constant k which ensures that
$|\underline{\sigma}(\Gamma) - k| > |\sigma_i(\Gamma) - k|$ can be eliminated by focusing attention on $\Gamma^{-1}$
instead of $\Gamma$ [MOH2]. In particular, $\Gamma^{-1}$ can be updated recursively using
the 'matrix inversion lemma', which states that:

$\Gamma^{-1}(t) = [\, \Gamma(t-1) + \gamma \gamma^t \,]^{-1} = \Gamma^{-1}(t-1) - \frac{\Gamma^{-1}(t-1)\, \gamma \gamma^t\, \Gamma^{-1}(t-1)}{1 + \gamma^t\, \Gamma^{-1}(t-1)\, \gamma}$

and since $\bar{\sigma}(\Gamma^{-1}) = 1 / \underline{\sigma}(\Gamma)$, the eigenvector associated with $\underline{\sigma}(\Gamma)$ is
identical to the one associated with $\bar{\sigma}(\Gamma^{-1})$. Thus, a "power method"
algorithm can be applied directly to $\Gamma^{-1}$ in a manner similar to that shown
in step 2 of Algorithm 8.2 to compute x(t).
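
The two recursive building blocks described above can be sketched as small update functions. The names `rls_update` and `inverse_power_update` are hypothetical, and the sketch covers only the individual updates (one recursive least-squares step, and one matrix-inversion-lemma update of $\Gamma^{-1}$ followed by a single power step); it is not a complete implementation of Algorithm 8.2, and the initial values for P and $\Gamma^{-1}$ are left to the caller.

```python
import numpy as np

def rls_update(g, P, d, y):
    """One recursive least-squares step (part 1 of Algorithm 8.2).

    g : current parameter estimate, shape (n,)
    P : current covariance-like matrix, shape (n, n)
    d : regressor [u_i(t-1), ..., u_i(t-n)], shape (n,)
    y : filtered output sample y_i(t)
    """
    Pd = P @ d
    K = Pd / (1.0 + d @ Pd)              # gain K(t)
    g_new = g + K * (y - d @ g)          # correct by the prediction error
    P_new = P - np.outer(K, d @ P)       # {I - K d^t} P
    return g_new, P_new

def inverse_power_update(Gamma_inv, gamma, x):
    """Rank-one update of Gamma^{-1} via the matrix inversion lemma, then one power step.

    Because the largest eigenvalue of Gamma^{-1} is the reciprocal of the
    smallest eigenvalue of Gamma, repeated power steps on Gamma^{-1} track the
    eigenvector of Gamma associated with its smallest eigenvalue.
    """
    Gg = Gamma_inv @ gamma
    Gamma_inv = Gamma_inv - np.outer(Gg, gamma @ Gamma_inv) / (1.0 + gamma @ Gg)
    z = Gamma_inv @ x
    return Gamma_inv, z / np.linalg.norm(z)
```

Working with $\Gamma^{-1}$ avoids both the shift constant k and any explicit matrix inversion: each new data vector costs a handful of matrix-vector products.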


8.4  A  Scalar  Difference  Equation  Formulation  of  the  Problem 

For  situations  when  a  large  number  of  elements  are  required  in  each  CWS 
to  produce  an  accurate  system  description,  the  direct  identification 
algorithm  of  the  previous  section  and  the  use  of  the  resulting  estimates  to 
predict  future  subsystem  outputs  may  create  a  significant  computational 
burden.  In  these  cases,  both  the  identification  and  prediction  problems  may 
be reformulated in terms of scalar difference equations to reduce the

computational  complexity  of  the  self-tuning  algorithm.  The  desired  results 
are  derived  below. 


8.4.1  Direct  Identification  of  Subsystem  Difference  Equation  Models 

For systems with no unstable branch points, $g_i(z)$ can be described
exactly by a rational infinite-length polynomial in powers of $z^{-1}$ (whose
coefficients decay to zero) as shown in Chapter 7. Under these conditions,
it is also possible to identify two finite-length polynomials,

$a_i(z) = 1 + a_i(1)\, z^{-1} + \cdots + a_i(N_a^i)\, z^{-N_a^i}$

and

$b_i(z) = b_i(1)\, z^{-1} + \cdots + b_i(N_b^i)\, z^{-N_b^i}$

such that the function $b_i / a_i$ describes $g_i$ to a specified level of accuracy.
As a result, $g_i$ may be replaced by $b_i / a_i$ and the identification algorithm
of the previous section can be extended directly to this new formulation.


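In particular, eqn 8.27 can be rewritten as:

$a_i\, v_i^t\, y(t) = b_i\, v_i^t\, u(t)$  ....(8.44)

or, in convolutional terms, as:

${}^{s}a_i * {}^{s}v_i^t * {}^{s}y = {}^{s}b_i * {}^{s}v_i^t * {}^{s}u$

Expanding this relationship into matrix form produces:

$A_i\, V_i\, y = B_i\, V_i\, u$  ....(8.45)

where y, u, and $V_i$ are defined as shown in Section 8.3, while $A_i$ and $B_i$ are
lower-triangular Toeplitz matrices formed from the elements of ${}^{s}a_i$ and ${}^{s}b_i$
in the same manner as $G_i$. As in Section 8.3, eqn 8.45 can be reformulated as:

$Y L x = U M x$  ....(8.46)

where x is defined as shown previously; Y and U are now matrices of
dimension $[N \times m(N_a^i+N_v^i)]$ and $[N \times m(N_b^i+N_v^i-1)]$ respectively, defined in a
manner analogous to that shown in Section 8.3; and L and M are matrices of
dimension $[m(N_a^i+N_v^i) \times mN_v^i]$ and $[m(N_b^i+N_v^i-1) \times mN_v^i]$ respectively, defined by:

$L = \begin{bmatrix} I_m & & 0 \\ a_i(1)\, I_m & \ddots & \\ \vdots & & I_m \\ a_i(N_a^i)\, I_m & & a_i(1)\, I_m \\ & \ddots & \vdots \\ 0 & & a_i(N_a^i)\, I_m \end{bmatrix} \qquad M = \begin{bmatrix} b_i(1)\, I_m & & 0 \\ \vdots & \ddots & \\ b_i(N_b^i)\, I_m & & b_i(1)\, I_m \\ & \ddots & \vdots \\ 0 & & b_i(N_b^i)\, I_m \end{bmatrix}$
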

Parameter identification for the polynomials $a_i$, $b_i$, and $v_i^t$ can now be
formulated as a constrained minimization problem with the following cost
function:

$J = \|(YL - UM)\, x\|^2 + \lambda\, (1 - x^t x)$  ....(8.47)

In a manner analogous to that of Section 8.3, the following
relationships can be established:

$L x = \bar{x} + \begin{bmatrix} 0 \\ X_a\, a \end{bmatrix} \qquad M x = X_b\, b$  ....(8.48a,b)

where $\bar{x} = [\, x^t \;|\; 0^t \,]^t$,

$a = [\, a_i(1) \;\cdots\; a_i(N_a^i) \,]^t \qquad b = [\, b_i(1) \;\cdots\; b_i(N_b^i) \,]^t$

and $X_a$ and $X_b$ are defined in the same manner as shown previously, but with
different dimensions; namely, $X_a$ and $X_b$ are dimensioned $[m(N_a^i+N_v^i-1) \times N_a^i]$
and $[m(N_b^i+N_v^i-1) \times N_b^i]$ respectively. Furthermore, if Y is partitioned as:

$Y = [\, y_k \;|\; Y_{k-1} \,]$

where $y_k$ contains the first m columns of Y and $Y_{k-1}$ is a matrix of dimension
$[N \times m(N_a^i+N_v^i-1)]$, then

$(YL - UM)\, x = Y \bar{x} + [\, Y_{k-1} X_a \;|\; -U X_b \,]\; \theta$  ....(8.49)

with $\theta = [\, a^t \;|\; b^t \,]^t$

Using the relationships above, the conditions which identify a stationary
point of J become:

$(YL - UM)^T (YL - UM)\, x = \lambda\, x$  ....(8.50a)

$\theta = \{D^T D\}^{-1} D^T\, Y \bar{x}$  ....(8.50b)

where $D = [\, -Y_{k-1} X_a \;|\; U X_b \,]$

The form of these results is identical to that described in Section
8.3 for the weighting sequence problem. As such, Algorithms 8.1 and 8.2 can
be used for this new problem formulation simply by making the substitutions
implied by eqn 8.50. Furthermore, the convergence characteristics of the
modified algorithms will be exactly the same as those given earlier. When
using this difference equation formulation for the characteristic subsystems,
however, it also becomes necessary to modify the output prediction
relationships associated with the predictive control law which were
developed in Section 8.1.1. The required modifications are discussed below.

8.4.2  Output  Prediction 

With parameter estimates for $a_i$, $b_i$, and $v_i^t$ available, the plant
transfer function matrix can be written in the following form:

$G(z) = W \Lambda_a^{-1} \Lambda_b V = \{W \Lambda_a^{-1} V\}\, \{W \Lambda_b V\} = (A')^{-1} B'$  ....(8.51)

where $\Lambda_a = \mathrm{diag}\{a_1, \cdots, a_m\}$ and $\Lambda_b = \mathrm{diag}\{b_1, \cdots, b_m\}$. Output

prediction  can  now  be  formulated  in  a  manner  analogous  to  that  for  the 
standard  difference  equation  formulation  discussed  at  the  end  of  Section 
8.1.1.  However,  it  should  be  noted  that  output  prediction  will  now  be 
performed  entirely  in  the  eigenvector  frame.  With  this  in  mind,  it  can  be 

shown that the best available estimates of y(t+k) at time t are given by:

$y(t+k|t) = F_k\, y(t) + E_k\, W \Lambda_b\, \Delta u(t+k)$  ....(8.52)

where $E_k$ and $F_k$ are defined by the modified Diophantine identity:

$E_k\, W A + z^{-k} F_k = I$  ....(8.53)

"  f  " 

Furthermore,  if  Gk  and  Gk  are  redefined  as 

\  ®k  ■  5"\v 

the relationship between $y_i(t+k|t)$, $\{\Delta u_i(t+j);\; j = 1, \cdots, k-1\}$, and $r_i(k)$
is precisely the same as that shown in eqns 8.20 and 8.21. Hence, the
predictive control law for this alternate subsystem description is identical
to that defined by eqn 8.23.

As  before,  the  Diophantine  relationships  provide  a  convenient  means  of 
describing  the  relationship  between  future  outputs  and  past,  present,  and 
future information. It should, however, be noted that, in practice, the

desired  eigenframe  output  predictions  may  be  generated  in  a  manner  which  is 
computationally  more  efficient  than  that  suggested  by  eqn  8.52.  Using  an 
approach  analogous  to  that  discussed  in  Section  8.1.3,  each  subsystem  output 
can  be  described  by  the  scalar  relationship: 


where  E^ 
y^(t+k)  =  y,(t+k-l)  +  Ay^t+k-l)  +  b.  Au(t+k-l)  - (8.54)

and  this  result  can  be  used  to  simulate  the  desired  output  predictions 
provided  y(t)  is  generated  by  transforming  the  known  plant  outputs  into  the 
eigenframe  using  V. 


8.5  Computational  Considerations  for  Implementation 

As  with  any  control  algorithm,  the  feasibility  (practicability)  of 
implementations  involving  the  multivariable  self-tuning  algorithm  proposed 
in  this  chapter  depends  upon  the  computational  complexity  of  the  algorithm. 
This  is  a  particularly  pressing  issue  for  multivariable  systems  since  the 
degree  of  complexity  increases  rapidly  with  the  dimension  of  the  system. 
Thus,  it  is  not  sufficient  to  address  only  the  theoretical  foundations  of 
the  control  algorithm;  one  must  also  consider  its  computational 
requirements.  The  following  discussion  addresses  these  concerns. 

A  useful  measure  of  the  computational  complexity  of  on-line  control 
algorithms  is  the  number  of  scalar  multiplication-additions  required  to 
implement  the  algorithm.  Under  the  assumption  that  all  algorithm  operations 
must  be  performed  using  a  serial  implementation  (as  is  currently  done  in 
practice), this quantity can be identified by recognizing that a [j x k]-
matrix by [k x n]-matrix multiplication requires (jkn) scalar operations and
that  (n  /n)  is  a  readily  available  upper  bound  on  the  number  of  operations 
required to perform [n x n] matrix inversions and eigenvalue/eigenvector
decompositions.  In  addition,  several  other  variables  which  directly  affect 
the  computational  complexity  of  the  algorithm  can  be  defined  as  follows: 


m = plant dimension

$N_G$ = number of matrix elements in ${}^{s}G$;      $N_g^i$ = number of elements in ${}^{s}g_i$

$N_{AB}$ = number of matrix elements in A, B;   $N_{ab}^i$ = number of elements in $a_i$, $b_i$

$N_v^i$ = number of elements in ${}^{s}v_i^t$

$N_w^i$ = number of elements in ${}^{s}w_i$

$N_u^i$ = control horizon for subsystem i

$N_1^i$ = minimum prediction horizon for subsystem i

$N_2^i = N_y^i$ = maximum prediction horizon for subsystem i

For simplicity, it will be assumed that A is diagonal and that $N_g^i = N_g$,
$N_{ab}^i = N_{ab}$, $N_v^i = N_v$, $N_w^i = N_w$, $N_u^i = N_u$, $N_1^i = N_1$, and $N_2^i = N_y^i = N_y$ for all i.
Using these definitions and assumptions, the per-cycle computations required
for each portion of the proposed algorithm can be estimated. General
expressions for these estimates are displayed in Table 8.1.

Clearly, quantifying the computational complexity of the general
algorithm is a difficult task involving a large number of system-dependent
variables. It is, however, instructive to identify the complexity
associated with several representative situations. This has been
accomplished for different algorithm configurations, and the results are
summarized in Tables 8.2 and 8.3. For purposes of comparison, the
computational complexity of the multivariable algorithm proposed by Mohtadi
et al [MOH1] (and denoted here by MSC) has also been identified under the
same assumptions. [Note: The algorithm structure identified in these
tables has been coded in the following manner:

A - Proposed algorithm with $G = A^{-1} B$

B - Proposed algorithm with $G = \sum_{i=1}^{N_G} G(i)\, z^{-i}$

C - Proposed algorithm with identification of $\{{}^{s}g_i, {}^{s}v_i^t\}$ using
Algorithm 8.2

D - Proposed algorithm with identification of $\{a_i, b_i, v_i^t\}$ using
a modified version of Algorithm 8.2

E - MSC algorithm with $G = A^{-1} B$.]

Although the results presented in Tables 8.2 and 8.3 are significantly
affected  by  the  assumptions  made,  they  highlight  several  important 
characteristics  of  the  proposed  self-tuning  algorithm.  First,  in 
applications  which  do  not  require  on-line  identification,  "control-only" 
versions  of  the  proposed  algorithm  prove  to  be  computationally  more 
Table 8.1: Serial Calculations Required to Implement Proposed Algorithm

Model Parameter Identification:

    $G(z) = A^{-1}B$                           $2m\, \{\, [(m+1) N_{AB}]^2 + 2 (m+1) N_{AB} \,\}$

    $G(z) = \sum_{i=1}^{N_G} G(i)\, z^{-i}$    $2m\, \{\, (m N_G)^2 + 2 m N_G \,\}$

    ${}^{s}\Lambda$                            $3 m^3 N_v^2 + m^2 N_v (2N_g + 5) + m\, (2N_g^2 + 4N_g)$

    $\Lambda_a,\ \Lambda_b$                    $3 m^3 N_v^2 + m^2 N_v (4N_{ab} + 5) + 8m\, (N_{ab}^2 + N_{ab})$

Identification of ${}^{s}\Lambda$, ${}^{s}W$ via CSM (required for $A^{-1}B$ and $\sum_{i=1}^{N_G} G(i)\, z^{-i}$):

    $G(z) = A^{-1}B$                           $m^4 + \frac{m^3}{2} \{N_y^2 + N_y + 2/m - 2\} + \frac{m^2}{2} \{N_y (N_y - 1) + N_{AB} (2N_y - N_{AB} + 1)\}$

    $G(z) = \sum_{i=1}^{N_G} G(i)\, z^{-i}$    $m^4 + \frac{m^3}{2} \{N_y^2 + N_y + 2/m - 2\} + \frac{m^2 N_y}{2} (N_y - 1)$

Dual CVS Calculations:                         $\frac{m^3}{2} \{N^2 + N + 2/m\}$

Output Prediction:

    $G(z) = A^{-1}B$                           $N_y\, m\, \{\, m\, (N_{AB} + N_w + N_v) + N_{AB} \,\}$

    $G(z) = \sum_{i=1}^{N_G} G(i)\, z^{-i}$    $N_y\, m\, \{N_G + N_w + N_v\}$

    ${}^{s}\Lambda$                            $m N_y N_g + 2 m^2 N_v$

    $\Lambda_a,\ \Lambda_b$                    $2 m N_y N_{ab} + 2 m^2 N_v$

Control Law Calculation:                       $m N_u\, \{N_y + N_u N_y + N_u^2 / N_u + 1\}$

Coordinate Transformation:                     $m^4 N$

Table 8.2: Summary of Required Computations for 2-Dimensional Systems

Assumptions (where applicable):
m = 2    N_G = 25    N_g = 25    N_v = 5    N_y = 10    N_AB = 6    N_ab = 6    N_w = 5

                                   Algorithm Structure
                             A        B        C        D        E

N_u = 1   Identification   2400    11170     4530     1980     1630
          Control           825     1465      605      345      500
          Total            3225    12635     5135     2325     2130

N_u = 2   Identification   2400    11170     4530     1980     1630
          Control           930     1570      710      450      900
          Total            3330    12740     5240     2430     2530

N_u = 3   Identification   2400    11170     4530     1980     1630
          Control          1120     1760      900      640     1740
          Total            3520    12930     5430     2620     3370


Table 8.3: Summary of Required Computations for 3-Dimensional Systems

Assumptions (where applicable):
m = 3    N_G = 25    N_g = 25    N_v = 5    N_y = 10    N_AB = 6    N_ab = 6    N_w = 5

                                   Algorithm Structure
                             A        B        C        D        E

N_u = 1   Identification   6590    37090     9000     4790     4150
          Control          1730     3260      905      555     1140
          Total            8320    40350     9905     5345     5290

N_u = 2   Identification   6590    37090     9000     4790     4150
          Control          1885     3415     1060      710     2530
          Total            8475    40505    10060     5500     6680

N_u = 3   Identification   6590    37090     9000     4790     4150
          Control          2175     3705     1350     1000     5630
          Total            8765    40795    10350     5790     9780

efficient than currently available algorithms such as the MSC algorithm.
Indeed, for the simplest control problem (i.e. m = 2, Nu = 1), the use of
algorithm structure C or D yields an algorithm with complexity comparable to
that of the MSC algorithm. For more difficult control problems (i.e. Nu > 1
or m > 2), the proposed algorithm (using structure A, C, or D) becomes much
less complex than the MSC algorithm. This latter result can be attributed
directly to the fact that all control law calculations in the proposed
algorithm involve only scalar operations for each subsystem, so increases in
Nu or m do not affect this algorithm as dramatically as they affect the MSC
algorithm. From the observations above, it is clear that, when off-line
identification is used to identify the appropriate controller parameters,
"control-only" versions of the proposed algorithm become practical for a
wide range of systems using current serial capabilities.

A second characteristic highlighted by the results in Tables 8.2 and
8.3 is the dramatic impact of the identification task on the computational
complexity of any multivariable algorithm. Indeed, in most instances, the
computations required for on-line identification are significantly larger
than those required for the control calculations. Again, however, a
comparison between the self-tuning algorithm proposed here and the MSC
algorithm is instructive. Since the plant identification algorithm used
with structure A is identical to that used in the MSC algorithm, one should
expect the self-tuning algorithm proposed here (implemented using structure
A) to yield a more complex procedure than that associated with the MSC
algorithm because structure A requires the additional computation of the CWS
and CVS. This is exactly the case for Nu = 1. However, as the difficulty
of the control problem increases (requiring larger values of Nu), the more
efficient control law of the proposed self-tuning algorithm offsets this
initial disadvantage. As a result, this algorithm (implemented using
standard recursive identification techniques) becomes comparable in
complexity to the MSC algorithm for values of Nu greater than 1. Another,
much more important, observation is that, when direct subsystem
identification is used (structure D), the proposed self-tuning algorithm
becomes comparable in complexity to the MSC algorithm for Nu = 1 and much
more efficient than the MSC algorithm for Nu > 1.
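The comparisons above can be checked directly against the tabulated counts. The following is a minimal sketch that transcribes the m = 2 figures from Table 8.2 and recomputes the totals and the share of each total taken up by identification. The function names are hypothetical, and reading column E as the MSC algorithm is an assumption inferred from the comparisons made in the text.

```python
# Operation counts transcribed from Table 8.2 (m = 2).
# Assumption: structure "E" is read here as the MSC algorithm.
IDENT = {"A": 2400, "B": 11170, "C": 4530, "D": 1980, "E": 1630}
CONTROL = {
    1: {"A": 825, "B": 1465, "C": 605, "D": 345, "E": 500},
    2: {"A": 930, "B": 1570, "C": 710, "D": 450, "E": 900},
    3: {"A": 1120, "B": 1760, "C": 900, "D": 640, "E": 1740},
}


def total(structure, n_u):
    """Identification plus control operations for one structure and Nu."""
    return IDENT[structure] + CONTROL[n_u][structure]


def identification_share(structure, n_u):
    """Fraction of the total operation count spent on identification."""
    return IDENT[structure] / total(structure, n_u)


if __name__ == "__main__":
    for n_u in (1, 2, 3):
        row = ", ".join(
            f"{s}: {total(s, n_u)} ({100 * identification_share(s, n_u):.0f}% ident.)"
            for s in "ABCDE"
        )
        print(f"Nu = {n_u}: {row}")
```

For these figures the control count of structure D stays below that of column E for every tabulated Nu, while the identification count dominates each total, which is the pattern described in the two preceding paragraphs.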

It is clear, however, that when on-line identification is required, the
complexity of the algorithm proposed here (and indeed any other
multivariable algorithm) effectively precludes its implementation on all but
small-dimensioned systems sampled at sufficiently slow rates when using
serial computations. This must not be construed as a reason to ignore the
algorithm. By its very nature, the general multivariable control problem is
particularly well-suited to the computational savings offered by the advent
of VLSI technology for parallel processing. For example, as suggested in
[WHI1], the computational complexity of matrix operations can be reduced
from O(n^3) to O(n) using parallel implementations, and this advance alone
produces a significant reduction in the complexity of the algorithm as shown
in Table 8.4. Furthermore, the characteristic subsystem approach offers
even further computational savings due to the fact that each subsystem can
be handled independently. Hence, the m subsystem control law algorithms can
be performed simultaneously using parallel implementations. An indication
of the additional savings that can be achieved using this approach is given
in Table 8.5. These results clearly suggest that, using emerging parallel
processing capabilities, the computational complexity of the proposed
multivariable algorithm can be reduced to the same order as that associated
with currently existing SISO self-tuning algorithms.
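The percentage reductions quoted in Tables 8.4 and 8.5 follow from simple arithmetic on the operation counts. The sketch below recomputes them for the m = 2, Nu = 1 case; the counts are transcribed from the tables, while the function and variable names are hypothetical.

```python
# Operation counts for m = 2, Nu = 1, structures A to E,
# transcribed from Tables 8.4 and 8.5.
SERIAL = [3225, 12635, 5135, 2325, 2130]
PARALLEL_MATRIX = [925, 1435, 1375, 755, 405]
PARALLEL_LOOP = [815, 1195, 980, 540, 330]


def percent_reduction(before, after):
    """Reduction from `before` to `after`, rounded to the nearest percent."""
    return round(100.0 * (before - after) / before)


def reductions(before, after):
    """Element-wise percent reductions for two lists of counts."""
    return [percent_reduction(b, a) for b, a in zip(before, after)]


if __name__ == "__main__":
    print("serial -> parallel matrix:", reductions(SERIAL, PARALLEL_MATRIX))
    print("parallel matrix -> parallel loop:",
          reductions(PARALLEL_MATRIX, PARALLEL_LOOP))
    print("serial -> parallel loop:", reductions(SERIAL, PARALLEL_LOOP))
```

The last line combines both stages, which the tables report only separately.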

8.6  Simulation  Results 

Computer simulations were used to demonstrate the multivariable self-tuning
control algorithm described in this chapter. The open-loop system
used for these simulations was identical to that described in Section 7.5
with the system transfer function matrix given by eqn 7.31. As mentioned in
Table 8.4: Required Computations using Parallel Matrix Operations

Assumptions (where applicable):
    NG = 25    Ng = 25    Nv = 5    Ny = 10    NAB = 6    Nab = 6    Nw = 5    Nu = 1

                                        Algorithm Structure
                                 A        B        C        D        E
m = 2:
    Serial Operations         3225    12635     5135     2325     2130
    Parallel Matrix
    Operations                 925     1435     1375      755      405
    Percent Reduction           71       89       73       68       81

m = 3:
    Serial Operations         8320    40350     9905     5345     5290
    Parallel Matrix
    Operations                1500     2525     2365     1300      645
    Percent Reduction           82       94       76       76       88
Table 8.5: Required Computations using Parallel Subsystem Configurations

Assumptions (where applicable):
    NG = 25    Ng = 25    Nv = 5    Ny = 10    NAB = 6    Nab = 6    Nw = 5    Nu = 1

                                        Algorithm Structure
                                 A        B        C        D        E
m = 2:
    Parallel Matrix
    Operations                 925     1435     1375      755      405
    Parallel Loop
    Calculations               815     1195      980      540      330
    Percent Reduction           12       17       29       28       19

m = 3:
    Parallel Matrix
    Operations                1500     2525     2365     1300      645
    Parallel Loop
    Calculations              1225     1840     1285      715      450
    Percent Reduction           18       27       46       45       30
Section 7.5, there are two unstable branch points associated with this
transfer function matrix. However, because of the proximity of these branch
points to one another, their effects on the CVS and dual CVS are not
significant initially. Indeed, the results presented in Section 7.5.2 clearly
demonstrate the accuracy that can be achieved using 5-term sequences.
Therefore, the control algorithm of Section 8.1 was implemented on this
system using 5-term operators, W and V, to perform the necessary
transformations.

For demonstration purposes, the simulations were generated by initially
applying a positive unit step input to loop 1 followed by the application of
a negative unit step input to loop 2 at sample 25. The closed-loop response
of the plant and the outputs from the controller were observed for 50
samples. As on-line identification was not used in the simulations, the
specification of subsystem control laws and the prediction of future
subsystem outputs were accomplished using fixed parameters defined by the
transfer function matrix in eqn 7.31. The minimum and maximum prediction
horizons were set to 1 and 10 respectively (i.e. Ny = 10) for both subsystems
throughout the simulations, while the control horizons (Nu) and weighting
factors (ρ) were varied.
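The roles of the control horizon and the weighting factor can be illustrated with a minimal sketch of a single scalar subsystem. It assumes the standard GPC-type law in which the control increments minimize the sum of squared predicted errors over the prediction horizon plus ρ times the sum of squared increments, with a minimum prediction horizon of 1. The function names, the step-response representation and the example values are hypothetical and are not taken from the thesis.

```python
def solve(H, b):
    """Solve H x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(H)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system; try a non-zero weighting")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def gpc_increments(step, free, ref, n_u, rho):
    """Control increments du(t), ..., du(t + n_u - 1) for one scalar subsystem.

    step : step-response coefficients g_1 .. g_Ny of the subsystem
    free : predicted output over the horizon with no further control moves
    ref  : reference (set-point) over the horizon
    """
    n_y = len(step)
    # Dynamic matrix G (n_y x n_u): row j predicts the output j + 1 steps ahead.
    G = [[step[j - i] if j >= i else 0.0 for i in range(n_u)]
         for j in range(n_y)]
    err = [r - f for r, f in zip(ref, free)]
    # Normal equations (G^T G + rho I) du = G^T err.
    H = [[sum(G[k][i] * G[k][j] for k in range(n_y)) + (rho if i == j else 0.0)
          for j in range(n_u)] for i in range(n_u)]
    b = [sum(G[k][i] * err[k] for k in range(n_y)) for i in range(n_u)]
    return solve(H, b)


if __name__ == "__main__":
    step = [1.0 - 0.5 ** (j + 1) for j in range(10)]  # hypothetical first-order lag
    for n_u, rho in [(1, 0.0), (3, 0.0), (3, 1.0)]:
        du = gpc_increments(step, [0.0] * 10, [1.0] * 10, n_u, rho)
        print(f"Nu = {n_u}, rho = {rho}: first increment = {du[0]:.3f}")
```

Only the first increment is applied at each sample in a receding-horizon scheme of this kind; a non-zero ρ enlarges the diagonal of the normal equations, which in the single-move case (Nu = 1) strictly reduces the computed increment.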

Initial simulation runs examined the effects of changing Nu (assuming
equal values for both subsystems) with ρ set to zero. The results for
Nu = 1 and Nu = 3 are presented in Figures 8.3 and 8.4. Additional
information (i.e. the settling time of system response, peak control
activity, and maximum interaction as a percent of the commanded input) for
these runs is presented in Table 8.6. In general, the closed-loop responses
for these tests were stable and nonoscillatory. For Nu = 1, the response
was sluggish and a moderate amount of interaction was present, but the
control activity required to produce this response was particularly low. As
Nu was increased to larger values, the system responded much more rapidly
and interaction became negligible. However, the price for these
improvements was unacceptably large control activity in both channels.


To maintain rapid response while reducing the level of control
activity, non-zero control weights were included in the control law for
values of Nu greater than 1. In particular, ρ was set to 1 for each
subsystem, and the observed closed-loop response characteristics from the
resulting simulations are included in Table 8.6. These results clearly
demonstrate that rapid response was maintained with a more reasonable level
of control activity, but the maximum level of interaction was significantly
larger. To overcome the increased interaction, the additional flexibility
associated with the ability to change the subsystem control horizons
independently was used to improve the results. For Nu^1 = 3, Nu^2 = 2, and
ρ1 = ρ2 = 1, acceptably rapid response was obtained using reasonable control
activity while minimizing the peak level of interaction. The simulation
results for this configuration are displayed in Figure 8.5.

All of the simulation results presented thus far were obtained without
the aid of high frequency alignment compensation, and they clearly show that
acceptable levels of interaction can be achieved without this additional
compensation. It is, however, interesting to compare these results with
those obtained by including appropriate high-frequency alignment
compensation. The compensator used for this purpose was identical to that
used in Section 7.5 (eqn 7.32). Again, 5-term operators, W and V, were used
to perform the necessary transformations, and the control algorithm of
Section 8.1 was implemented. In general, the use of this precompensator
left the desired system responses and the corresponding levels of control
activity relatively unaffected, but it did reduce interaction (as shown in
Table 8.6). A representative example (Nu^1 = Nu^2 = 3; ρ = 1) of the
compensated system response for this modified system is presented in Figure
8.6.

The simulations presented here demonstrate clearly that appropriate
control of the characteristic subsystems translates directly into effective
loop-by-loop control of the multivariable system. Moreover, it is possible
to use the additional flexibility associated with the independent control of
each individual subsystem to fine-tune system performance, a facility not
explicitly available when a single scalar cost function is applied. The
result is the achievement of good system response with moderate control
activity and acceptably low levels of interaction.
Table 8.6: Performance Comparison of Simulation Results

                       Settling Time        Peak Control          Maximum
                         (samples)            Activity         Interaction (%)
Nu^1   Nu^2    ρ      Loop 1   Loop 2     Loop 1   Loop 2     Loop 1   Loop 2

Without High Frequency Alignment Compensation:
  1      1     0        15       16        0.16     0.14       10.4     14.5
  2      2     0         5        5        0.88     0.73       15.2     16.5
  3      3     0         3        2        1.50     1.65        4.8      4.8
  4      4     0         1        2        1.53     2.04        1.6      1.8
  2      2     1         4        5        0.62     0.50       16.1     16.0
  3      3     1         4        4        0.53     0.44       16.1     15.4
  4      4     1         6        7        0.51     0.43       16.2     15.6
  3      2     1         4        4        0.57     0.49       13.9     12.1

With High Frequency Alignment Compensation:
  1      1     0        15       17        0.13     0.15        8.7      9.6
  3      3     1         5        5        0.68     0.81        7.8      5.8
  3      2     1         5        6        0.65     0.84        7.7      6.1
Figure 8.6: Simulation Results With High Frequency Alignment Compensation


Appendix  8.1  A  Difference  Equation  Formulation  of  the  CSM  Algorithm 
Consider the system described by the transfer function matrix

    $G(z) = A^{-1}(z)\,B(z)$    (A8.1)

where $B(z) = \sum_{i=0}^{k} B(i)\,z^{-i}$ and $A(z) = I + \sum_{i=1}^{k} A(i)\,z^{-i}$.

The defining equation for the eigenfunctions and characteristic directions
of this system is, therefore, given by $A^{-1} B\,w_i = g_i\,w_i$; or alternatively,

    $B\,w_i = g_i\,A\,w_i$    (A8.2)

After transforming this relationship into the time domain using the inverse
z-transform, the following convolution relationship can be established:

    $S_B * s_{w_i} = s_{g_i} * S_A * s_{w_i}$    (A8.3)

where $s_{g_i}$ and $s_{w_i}$ are the i-th CWS and CVS of the system and $S_B$ and $S_A$ are
given by $S_B = \{B(0), \ldots, B(k)\}$ and $S_A = \{I, A(1), \ldots, A(k)\}$.

Using eqn A8.3, the desired algorithm can be generated in much the same
way as the standard CSM algorithm. In particular, at the initial stage of
the convolution, eqn A8.3 indicates that $B(0)\,w_i(0) = g_i(0)\,w_i(0)$. Hence,
$g_i(0)$ and $w_i(0)$ are the i-th eigenvalue/eigenvector pair of $B(0)$. For
j = 1 (the next stage of the convolution), eqn A8.3 indicates that:

    $B(1)\,w_i(0) + B(0)\,w_i(1) = g_i(0)\,w_i(1) + g_i(0)\,A(1)\,w_i(0) + g_i(1)\,w_i(0)$    (A8.4)

Premultiplying eqn A8.4 by $v_i^T(0)$ (i.e. the i-th dual eigenvector of $B(0)$)
yields the following expression for $g_i(1)$:

    $g_i(1) = v_i^T(0)\,\{B(1) - g_i(0)\,A(1)\}\,w_i(0)$    (A8.5a)

Furthermore, eqn A8.4 can be rewritten in the following form:

    $\{g_i(0)\,I - B(0)\}\,w_i(1) = \{B(1) - g_i(0)\,A(1) - g_i(1)\,I\}\,w_i(0)$

Hence, $w_i(1)$ is given by:

    $w_i(1) = T_i(0)\,\{B(1) - g_i(0)\,A(1)\}\,w_i(0) + \alpha\,w_i(0)$    (A8.5b)

where $T_i(0)$ is the commuting g2-Penrose inverse of $\{g_i(0)\,I - B(0)\}$ (see eqn
A7.3) and $\alpha$ is an arbitrary constant (which may be set to zero).

Similar  results  can  be  derived  for  each  additional  stage  of  the 
convolution  defined  by  eqn  A8.3  to  produce  a  general  algorithm,  but  the 
details  of  this  development  will  not  be  presented  here. 
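The first two stages described above can be sketched numerically for a 2-input, 2-output system. The following is a minimal illustration, not the thesis implementation: it assumes that B(0) has real, distinct eigenvalues, it evaluates only the stage-0 eigenvalue/eigenvector pairs and the stage-1 eigenvalue term of eqn A8.5a, and all function names are hypothetical.

```python
import math


def eig2(B):
    """Eigenvalues and eigenvectors of a real 2x2 matrix.

    Sketch only: assumes real, distinct eigenvalues.
    """
    (a, b), (c, d) = B
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4.0 * det
    if disc <= 0.0:
        raise ValueError("sketch assumes real, distinct eigenvalues")
    root = math.sqrt(disc)
    vals = [(tr - root) / 2.0, (tr + root) / 2.0]
    vecs = []
    for lam in vals:
        if abs(b) > 1e-12:
            vecs.append((b, lam - a))
        elif abs(c) > 1e-12:
            vecs.append((lam - d, c))
        else:  # diagonal matrix: eigenvectors are the unit vectors
            vecs.append((1.0, 0.0) if abs(lam - a) < 1e-12 else (0.0, 1.0))
    return vals, vecs


def dual_vectors(vecs):
    """Rows of W^{-1}, where the columns of W are the eigenvectors."""
    (w11, w21), (w12, w22) = vecs
    det = w11 * w22 - w12 * w21
    return [(w22 / det, -w12 / det), (-w21 / det, w11 / det)]


def first_two_cws_terms(B0, B1, A1):
    """Return [(g_i(0), g_i(1))] for i = 1, 2 using eqn A8.5a."""
    vals, vecs = eig2(B0)
    duals = dual_vectors(vecs)
    terms = []
    for g0, w, v in zip(vals, vecs, duals):
        # M = B(1) - g_i(0) A(1)
        M = [[B1[r][c] - g0 * A1[r][c] for c in range(2)] for r in range(2)]
        Mw = [M[r][0] * w[0] + M[r][1] * w[1] for r in range(2)]
        terms.append((g0, v[0] * Mw[0] + v[1] * Mw[1]))
    return terms
```

As a consistency check, in the scalar case (b0 + b1 z^-1)/(1 + a1 z^-1) = b0 + (b1 - b0 a1) z^-1 + ..., which is the form that eqn A8.5a reduces to when the matrices are diagonal.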
Appendix 8.2  A Perturbation Analysis of Algorithm 8.1

Given that $x_o$, $g_o$ defines a stationary point of J (eqn 8.31) for which
$J = \lambda = 0$, the following matrices (with their singular value decompositions)
are given by:

    $R_o = Y - U M_o = S_1 \Sigma S_2^T$,    $A_o = R_o^T R_o = S_2 \Sigma^2 S_2^T$

    $D_o = U X_o = T_1 \Gamma T_2^T$

where $M_o$ and $X_o$ are defined as shown in Section 8.3 using the elements of $x_o$
and $g_o$, $\Sigma = \mathrm{diag}\{0, \sigma_2, \ldots, \sigma_{N_v m - 1}, \sigma_{N_v m}\}$ with
$\sigma_2 \le \sigma_3 \le \cdots \le \sigma_{N_v m}$, $\Gamma = \mathrm{diag}\{\gamma_1, \ldots, \gamma_{N_g}\}$, and $S_i$ and $T_i$
are orthogonal matrices with $S_i^T S_i = T_i^T T_i = I$ for i = 1, 2.

Assuming the initial value for x in Algorithm 8.1 is given by
$\hat{x}_o = x_o + \delta x_o$, eqn 8.37 can be used to develop the initial estimate for g as:

    $\hat{g}_o = \{(D_o + \delta D_o)^T (D_o + \delta D_o)\}^{-1} (D_o + \delta D_o)^T\,Y\,(x_o + \delta x_o)$

    $\approx \{I - (D_o^T D_o)^{-1} [\delta D_o^T D_o + D_o^T \delta D_o]\}\,\{g_o + (D_o^T D_o)^{-1} [\delta D_o^T Y x_o + D_o^T Y \delta x_o]\}$

    $\approx g_o + (D_o^T D_o)^{-1} \delta D_o^T (Y x_o - D_o g_o) + (D_o^T D_o)^{-1} D_o^T (Y \delta x_o - \delta D_o g_o)$

But $Y x_o - D_o g_o = R_o x_o = 0$ (since $x_o$ is a stationary point for which J = 0)
and $Y \delta x_o - \delta D_o g_o = R_o \delta x_o$. Thus, the difference between $\hat{g}_o$ and $g_o$ is:

    $\delta g_o = (D_o^T D_o)^{-1} D_o^T R_o\,\delta x_o$    (A8.6)

This result can now be used with eqn 8.34 to identify the error in the next
estimate of x ($x_o + \delta x_1$) as shown here:

    $(A_o + \delta A_o)(x_o + \delta x_1) = \delta\lambda\,(x_o + \delta x_1)$

    $A_o\,\delta x_1 \approx (\delta\lambda\,I - \delta A_o)\,x_o$

    $\delta x_1 = A_o^{+} (\delta\lambda\,I - \delta A_o)\,x_o$    (A8.7)

where $A_o^{+} = S_2 \tilde{\Sigma}^2 S_2^T$ and $\tilde{\Sigma} = \mathrm{diag}\{0, 1/\sigma_2, \ldots, 1/\sigma_{N_v m}\}$. By definition,
$A_o^{+} x_o = 0$ and so eqn A8.7 can be rewritten as:

    $\delta x_1 = -A_o^{+}\,\delta A_o\,x_o$    (A8.8)

Furthermore, $\delta A_o$ can be identified using the relationship

    $A_o + \delta A_o = (Y - U M_o - U\,\delta M_o)^T (Y - U M_o - U\,\delta M_o)$

which implies that, to first order, $\delta A_o = -\delta M_o^T U^T R_o - R_o^T U\,\delta M_o$.

Remembering that $R_o x_o = 0$ and noting that $\delta M_o\,x_o = X_o\,\delta g_o$ (where $\delta g_o$ is
defined by eqn A8.6), $\delta x_1$ can now be written in terms of $\delta x_o$ as:

    $\delta x_1 = A_o^{+} R_o^T D_o (D_o^T D_o)^{-1} D_o^T R_o\,\delta x_o$    (A8.9)

Using the singular value decompositions defined above, $\delta x_1$ can also be
written as:

    $\delta x_1 = S_2 \tilde{\Sigma} S_1^T T_1 T_1^T S_1 \Sigma S_2^T\,\delta x_o$    (A8.10)

This result can be extended to relate the error at the k-th iteration to the
initial error as:

    $\delta x_k = S_2 \tilde{\Sigma} (S_1^T T_1 T_1^T S_1)^k \Sigma S_2^T\,\delta x_o$    (A8.11)

Thus, the magnitude of $\delta x_k$ can be bounded (using singular value
inequalities) by:

    $\|\delta x_k\| \le \frac{\sigma_{N_v m}}{\sigma_2}\,\{\bar{\sigma}[S_1^T T_1 T_1^T S_1]\}^k\,\|\delta x_o\|$    (A8.12)

But

    $\bar{\sigma}[S_1^T T_1 T_1^T S_1] \le \bar{\sigma}[S_1^T]\,\bar{\sigma}[T_1]\,\bar{\sigma}[T_1^T]\,\bar{\sigma}[S_1] = 1$    (A8.13)

and by the Major Principal Direction Alignment principle [KOU5], the
equality in eqn A8.13 holds only if the major input and output principal
directions of $S_1$ and $T_1$ are aligned. Thus in general, $\{\bar{\sigma}[S_1^T T_1 T_1^T S_1]\}^k \to 0$
as $k \to \infty$ and so $\|\delta x_k\| \to 0$ as $k \to \infty$.

It should, however, be noted that the actual size of $\delta x_k$ at any given
step in the iteration is a function of $\sigma_{N_v m}$, $\sigma_2$, and the orientation of $\delta x_o$ as
implied by eqns A8.11 and A8.12. In some instances, $\|\delta x_k\|$ may
actually be larger than $\|\delta x_o\|$ during the first few iterations of the
algorithm. For problems where two or more solutions are sufficiently close
together, the orientation of $\delta x_k$ combined with this amplification in size
during the initial iterations may cause the algorithm to jump from the
assumed solution, $x_o$, to another. Under these special circumstances, the
algorithm will still converge to a correct solution, but it may not be the
one originally assumed.
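The role of the alignment condition in eqn A8.13 can be illustrated in the simplest possible setting. The sketch below assumes a hypothetical rank-one case in which S_1 and T_1 each reduce to a single unit vector in the plane, so that the factor in eqn A8.13 becomes the squared cosine of the angle between them. The function names and the tolerance-based iteration count are illustrative only and ignore the constant multiplying the bound in eqn A8.12.

```python
import math


def contraction_factor(theta):
    """Value of s^T t t^T s for unit vectors s and t separated by theta.

    Rank-one illustration: equals cos(theta)**2, which is 1 only when the
    two directions are aligned (theta = 0 or pi).
    """
    s = (1.0, 0.0)
    t = (math.cos(theta), math.sin(theta))
    dot = s[0] * t[0] + s[1] * t[1]
    return dot * dot


def iterations_to_tolerance(theta, tol):
    """Smallest k such that contraction_factor(theta)**k <= tol."""
    c = contraction_factor(theta)
    if c >= 1.0:
        raise ValueError("aligned directions: the bound does not decay")
    if c <= 0.0:
        return 1
    return math.ceil(math.log(tol) / math.log(c))


if __name__ == "__main__":
    for degrees in (10, 30, 60, 85):
        theta = math.radians(degrees)
        print(degrees, round(contraction_factor(theta), 4),
              iterations_to_tolerance(theta, 1e-3))
```

Directions that are nearly aligned give a factor close to one and hence slow decay of the bound, whereas nearly orthogonal directions give very rapid decay.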
CHAPTER NINE


CONCLUSION 

Research  in  the  area  of  robust  control  system  design  during  the  past 
decade  has  produced  a  number  of  results  which  strive  to  incorporate 
information  on  system  uncertainty  into  the  analysis  and  design  of  feedback 
control  systems  in  a  systematic  manner.  The  primary  objective  of  this 
thesis  was  to  advance  towards  this  goal  by  establishing  additional 
techniques  for  use  in  the  analysis  and  design  of  multivariable  control 
systems  in  the  presence  of  uncertainty.  This  aim  has  largely  been  fulfilled 
by  the  developments  presented  in  earlier  chapters.  In  this  chapter,  the 
main  results  of  the  thesis  are  summarized  and  potential  topics  for  further 
research  are  highlighted. 

9.1  Summary 

After  the  introduction  in  Chapter  1,  a  summary  of  the  predominant 
multivariable  frequency-domain  analysis  tools  for  both  "certain"  and 
"uncertain"  systems  was  presented  in  Chapter  2  to  provide  the  foundation  and 
motivation  for  subsequent  developments.  The  problem  of  generating  an 
accurate  description  of  frequency  response  uncertainty  for  use  in  these 
analysis  techniques  was  addressed  in  Chapters  3  through  6.  A  method  was 
derived  in  Chapter  3  to  quantify  the  variability  of  the  frequency  response 
estimates  obtained  from  an  estimated  parametric  model.  Finite  weighting 
sequence  models  were  found  to  be  particularly  useful  for  this  purpose,  but 
these  models  introduce  the  need  to  identify  an  appropriate  level  of 
truncation  and,  simultaneously,  introduce  a  second  element  of  uncertainty 
(i.e.  the  bias  associated  with  the  specified  truncation).  For  these 
reasons,  the  problem  of  weighting  sequence  truncation  was  considered  next, 
and  two  new  truncation  criteria  were  proposed.  The  first  criterion 
(presented  in  Chapter  4)  was  derived  using  a  geometric  interpretation  of  the 
standard  "parameter  space"  problem  to  address  the  bias/variability  trade-off 
by selecting the truncation which minimizes the expected distance between
the true and estimated frequency response at a prespecified individual
frequency. The second criterion (presented in Chapter 5) was derived using
Akaike's criterion [AKA1] as a starting point and exploiting special
properties of weighting sequence models to establish an implementable
(rather than theoretical) procedure for optimal truncation. The
implementation of this criterion was shown to produce not only a statistically
optimal level of truncation, but also explicit upper bounds on the frequency
response bias associated with this truncation. These bias bounds were
combined with the variability information derived in Chapter 3 to establish
a complete and quantitative description of frequency response uncertainty
using results presented in Chapter 6. In addition, techniques to refine and
tailor this uncertainty description to the frequency response
characteristics of given systems were proposed. The results were also extended
to multivariable systems to yield the desired frequency response uncertainty
description in terms of element-by-element bounded perturbations, and it was
shown that this description can be used to establish eigenvalue inclusion
regions for use in the assessment of robust stability and performance.

In  Chapter  7,  attention  shifted  to  the  multivariable  control  design 
problem  and,  more  specifically,  to  the  development  of  a  multivariable 
control  design  methodology  that  can  be  applied  using  on-line,  computer- 
implemented  algorithms.  A  z-domain  "characteristic  subsystem"  decomposition 
was  developed  for  multivariable  systems  to  establish  the  necessary  link 
between  the  conventional  frequency-domain  design  philosophy  of  the 
generalized-Nyquist/characteristic-locus  approach  and  the  corresponding 
discrete-time  computer  implementations  of  these  designs.  Indeed,  it  was 
shown  that  this  "characteristic  subsystem"  approach  yields  a  much  more 
accurate  implementation  for  conventional  frequency-domain  designs  than  that 
available  using  currently-existing  techniques.  More  importantly,  this 
approach  provided  the  foundation  for  extending  SISO  self-tuning  control 
algorithms to the multivariable problem in precisely the same manner as the
characteristic locus design method generalizes classical frequency response
design techniques. The derivation of one such multivariable self-tuner was
presented in Chapter 8 using the SISO GPC algorithm as the basis for the
development. The problem of on-line identification was also examined and
two algorithms were proposed to generate the required subsystem descriptions
directly from input/output data. These algorithms offer the potential for a
significant reduction in the computational complexity of the overall
self-tuning algorithm, and thus should make it possible to apply the proposed
multivariable self-tuning algorithm to a large class of systems using
existing computer capabilities.

9.2  Future  Research 

The  developments  in  this  thesis,  as  with  any  research  effort,  highlight 
several  additional  and  potentially  important  topics  for  further  study.  From 
a  frequency-domain  point  of  view,  the  element-by-element  bounds  on  system 
frequency  response  uncertainty  developed  in  Chapters  3  through  6  establish  a 
precise  description  of  system  uncertainty.  This  information  should  make  it 
possible  to  enhance  the  development  of  frequency-domain  control  design 
techniques  to  handle  system  uncertainty. 

For  currently-existing  approaches  (which  handle  system  uncertainty 
implicitly  via  specified  performance  indices),  this  uncertainty  information, 
when  incorporated  into  the  performance  index  specifications  for  the  given 
methodology,  should  produce  control  designs  which  are  tailored  to  the  known 
uncertainty  characteristics  of  the  system.  Furthermore,  this  information 
should  also  make  it  possible  to  focus  on  new  design  methodologies  to  handle 
system  uncertainty  in  a  direct  and  explicit  manner.  As  pointed  out  earlier, 
tentative  steps  have  already  been  taken  towards  accomplishing  this  task 
within an H∞ framework ([DOY3], [BIR1]), and the ability to generate system-
specific frequency response uncertainty information (as described in this
thesis) should provide new impetus to the development of these techniques.
Alternatively, the results derived here offer the potential for the
development of a robust design methodology based on a "characteristic-locus-type"
approach. As demonstrated in Chapter 6, the precise element-by-element
uncertainty bounds developed here can be combined with earlier structured
E-contour results ([KOU4], [KOU6]) to produce accurate gain and phase
information on the characteristic loci of the perturbed system. As such, it
should be possible to produce closed-loop designs which achieve desired
performance goals (while simultaneously accounting for the sensitivity of
the characteristic loci to the specified uncertainty) by manipulating the
eigenvalue inclusion bands rather than the nominal characteristic loci
alone. The key to accomplishing these modifications is, of course, the
ability to develop a controller structure which can be used to manipulate
the characteristic locus bands. As yet, this task has not received a great
deal of attention. But now that a precise characteristic locus uncertainty
description can be generated, it seems reasonable to anticipate that
investigations in this area will produce viable techniques for the design of
robust control systems, thereby extending classical frequency response
techniques to the problem of controlling "uncertain" multivariable systems.

Provided  a  characteristic  locus  approach  to  robust  control  design  can 
be  developed,  the  ability  to  produce  tighter  characteristic  locus  inclusion 
regions  may  also  prove  beneficial.  The  currently-proposed  algorithm  for 
generating  these  regions  relies  on  a  transformation  from  input/output  data 
to  element-by-element  uncertainty  bounds  and,  ultimately,  to  E-contour 
bounds  for  the  characteristic  loci.  But  as  mentioned  previously,  this 
approach  introduces  an  element  of  conservatism  in  the  uncertainty 
description  due  to  the  fact  that  the  specified  confidence  associated  with 
the parameter uncertainty ellipsoids translates into a lower bound on the
confidence associated with the corresponding E-contours. The development of
algorithms which estimate "characteristic subsystem" descriptions directly
from input/output data (as proposed in Chapter 8), however, suggests that it
may be possible to by-pass the intermediate steps in this process to produce
confidence  bounds  directly  on  each  individual  characteristic  locus  using 
SISO  results  similar  to  those  developed  in  this  thesis.  Though  the  steps  in 
the  development  of  this  uncertainty  description  are  not  immediately  obvious, 
investigations  in  this  area  could  produce  a  more  refined  description  of 
characteristic  locus  uncertainty  while  simultaneously  eliminating  the  time- 
consuming  intermediate  steps  associated  with  the  current  approach. 

The time-domain and self-tuning control concepts derived in Chapters 7
and 8 also highlight several areas for future work. For example, additional
research in the areas of branch point placement, on-line high frequency
alignment compensation and direct subsystem identification is needed to
produce a completely general multivariable self-tuning algorithm within the
proposed characteristic subsystem framework. As highlighted previously, the
location  of  branch  points  can  be  altered  using  constant  precompensation.  It 
seems  reasonable,  therefore,  to  anticipate  that  straightforward  algorithms 
can  be  derived  to  generate  simple  precompensators  which  reposition  branch 
points  to  desired  locations  in  the  z-plane.  Such  algorithms  would  not  only 
ensure  the  general  applicability  of  the  characteristic  subsystem  framework 
for  self-tuning  applications  by  eliminating  unstable  branch  points,  but  they 
could  also  be  used  to  reposition  stable  branch  points  so  as  to  increase  the 
decay  rates  of  the  CVS  and  dual  CVS.  Hence,  the  number  of  terms  needed  to 
generate  an  accurate  eigenvector  approximation  (and,  as  a  result,  the 
computational  complexity  of  the  control  algorithm)  could  be  reduced. 
Furthermore,  since  constant  precompensation  can  also  be  used  to  achieve  high 
frequency  alignment,  it  may  be  possible  to  develop  algorithms  which 
simultaneously  address  the  problem  of  reducing  high  frequency  interaction; 
thereby  improving  system  response  still  further  at  no  extra  computational 
expense.  Initial  efforts  in  this  area  [NICIJ  have  produced  some  encouraging 
results.  However,  a  great  deal  of  research  is  still  required.  As  for 
direct  subsystem  identification,  preliminary  studies  of  the  iterative 
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algorithms  presented  in  Chapter  8  have  also  yielded  promising  results. 
However,  the  proposed  algorithms  must  be  investigated  in  much  greater  detail 
to  establish  their  convergence  characteristics  (under  stochastic  conditions) 
and  to  validate  the  proposed  on-line  implementations.  Once  these  tasks  are 
accomplished,  the  resulting  algorithms  should  provide  a  useful  new  technique 
for  multivariable  system  identification  and  a  computationally  efficient 
means  of  incorporating  identification  into  the  self-tuning  design. 

In  addition  to  these  tasks,  it  must  be  stressed  that  the  characteristic 
subsystem  decomposition  for  multivariable  systems  established  in  Chapter  7 
provides  a  general  means  of  transforming  the  multivariable  control  problem 
into  a  set  of  independent  SISO  problems  that  are  compatible  with  on-line 
computer  implementations.  As  such,  the  development  of  real-time 
multivariable  algorithms  need  not  be  restricted  to  the  set  of  long-range 
predictive  control  algorithms  such  as  SISO  GPC.  Indeed,  studies  to 
incorporate  other  SISO  algorithms  into  this  characteristic  subsystem 
framework  can  be  undertaken  and  should  ultimately  lead  to  a  wide  range  of 
multivariable  self-tuning  algorithms  which  account  for  the  multivariable 
characteristics  of  the  system  in  a  true  generalized-Nyquist  sense. 
Furthermore,  the  eigenvalue  decomposition  used  to  generate  the 
characteristic  subsystems  here  suggests  the  possibility  of  establishing  an 
alternative  subsystem  description  based  on  singular  value  decompositions. 
As  singular  values  are  known  to  be  less  sensitive  to  perturbations,  a 
singular-value-based  subsystem  description  could  be  used  to  improve  the 
sensitivity  characteristics  of  the  resulting  control  algorithms  and, 
perhaps,  to  bridge  the  gap  between  self-tuning  and  H∞  control  designs. 
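
The sensitivity argument above can be illustrated with a small numerical sketch. The matrix G below is a hypothetical 2x2 gain matrix chosen only because it is strongly non-normal, and the perturbation E is likewise an assumption; neither is taken from the systems studied in this thesis. The sketch compares how far the eigenvalues and the singular values move when one element is perturbed. By Weyl's inequality, the singular values cannot move by more than the 2-norm of the perturbation, whereas no such bound holds for the eigenvalues of a non-normal matrix.

```python
import numpy as np

# Hypothetical gain matrix at a single frequency (illustrative values only).
# It is strongly non-normal, which is when eigenvalues are most sensitive.
G = np.array([[1.0, 1000.0],
              [0.0, 1.0]])

# Small perturbation of a single element (assumed for illustration).
eps = 1e-6
E = np.array([[0.0, 0.0],
              [eps, 0.0]])


def eig_shift(A, B):
    """Largest distance from an eigenvalue of B to the nearest eigenvalue of A."""
    ea = np.linalg.eigvals(A)
    eb = np.linalg.eigvals(B)
    return float(max(min(abs(b - a) for a in ea) for b in eb))


def sv_shift(A, B):
    """Largest change in the ordered singular values between A and B."""
    sa = np.linalg.svd(A, compute_uv=False)
    sb = np.linalg.svd(B, compute_uv=False)
    return float(np.max(np.abs(sa - sb)))


pert_norm = float(np.linalg.norm(E, 2))   # 2-norm of the perturbation
eig_change = eig_shift(G, G + E)          # eigenvalue movement
sv_change = sv_shift(G, G + E)            # singular value movement

print("perturbation 2-norm  :", pert_norm)
print("eigenvalue shift     :", eig_change)
print("singular value shift :", sv_change)
```

For this particular choice of G, the eigenvalues move by several orders of magnitude more than the size of the perturbation, while the singular values stay within the Weyl bound. A normal matrix would show no such gap, so the example indicates the worst-case behaviour rather than a typical one.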

The  final  effort,  the  practical  application  of  the  proposed  self-tuning 
control  algorithm,  goes  without  saying.  It  is  hoped  that  future  efforts 
will  transform  the  theoretical  developments  presented  here  into  a  practical 
computer  implementation  for  the  control  of  real-life  multivariable  systems. 